Herman J. Radtke IIIZola2024-01-14T00:00:00+00:00https://hermanradtke.com/atom.xmlhttp://activitystrea.ms/schema/1.0/postDeveloper Experience: Fast Startup Is Not The Only Speed Metric2024-01-14T00:00:00+00:002024-01-14T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/developer-experience-fast-startup-is-not-the-only-speed-metric/<p>I was reading <a href="https://registerspill.thorstenball.com/p/how-fast-is-your-shell">How fast is your shell?</a> and it got me thinking about the different types of speed developers prioritize. A lot of articles and advice focus on initial startup, which can be thought of as <em>acceleration</em>. We should also consider what happens after we are done accelerating: how easy it is to maintain our velocity.</p>
<span id="continue-reading"></span>
<p>Look, I am biased towards fast startup times. Here is my zsh startup time:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ time zsh -i -c exit
</span><span>zsh -i -c exit 0.04s user 0.02s system 94% cpu 0.063 total
</span><span>$ time zsh -i -c exit
</span><span>zsh -i -c exit 0.04s user 0.02s system 95% cpu 0.066 total
</span><span>$ time zsh -i -c exit
</span><span>zsh -i -c exit 0.04s user 0.02s system 93% cpu 0.067 total
</span><span>$ time zsh -i -c exit
</span><span>zsh -i -c exit 0.04s user 0.02s system 94% cpu 0.069 total
</span><span>$ time zsh -i -c exit
</span><span>zsh -i -c exit 0.04s user 0.02s system 93% cpu 0.065 total
</span></code></pre>
<p>I open and close my shell quite often. As a result, my shell configuration is fairly minimal. I generally omit features, like custom completions, to achieve this. This is not the only way people work.</p>
<p>When I pair with some of my colleagues, I observe that they have a different style of working. They open VS Code and then open a single shell within the VS Code UI. They usually leave VS Code open all day and use that one shell. The initial startup time of VS Code and the shell within VS Code is not that important to them. What is important to them is how easily they can maintain their productivity level over the course of the day.</p>
<p>People that have a long-lived shell are probably better off adding additional functionality, such as custom completions. These additional features help maintain their productivity velocity. My suggestions to them about improving their shell startup time will often seem unimportant to them. They are optimizing for a different outcome than I am.</p>
<p>A while back, I switched from vim to neovim. Prior to the switch, I kept my vim configuration pretty minimal to avoid slowing down vim's startup time. I sacrificed some of the more advanced, heavier plugins as a result. I was fine with this trade-off. Neovim makes it much easier to lazy-load plugins, which allows me to maintain a fast startup time while also providing me access to features, such as language servers.</p>
<p>I am not lazy loading anything in zsh. Maybe I should? I found <a href="https://github.com/qoomon/zsh-lazyload">zsh-lazyload</a> from a quick Google search, but I have not looked into it closer. I know fish has <a href="https://fishshell.com/docs/current/tutorial.html#autoloading-functions">autoloading functions</a>, which is basically lazy loading. The downside of lazy loading approaches is the additional complexity. I will only consider lazy loading things in my shell if it is relatively straightforward and a first-class concern of the shell.</p>
<p>I like the <a href="https://registerspill.thorstenball.com/p/how-fast-is-your-shell">How fast is your shell?</a> article because it matches my own development style. I also try to keep in mind that others have styles of working that require a different set of optimizations. Ideally, we can be creative enough to get fast startup times and access to all the features we want using concepts like lazy-loading.</p>
http://activitystrea.ms/schema/1.0/postProving getenv Does Not Make a Syscall2024-01-09T00:00:00+00:002024-01-09T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/proving-getenv-does-not-make-syscall/<p>I saw this statement in an accepted answer on Stack Overflow:</p>
<blockquote>
<p>Retrieving the value of an environment variable will incur a system call.</p>
</blockquote>
<p><a href="https://stackoverflow.com/a/7460584/775246">source</a></p>
<p>This answer surprised me as I did not think this was the case. There is an edit farther down that has links to other Stack Overflow posts saying <code>getenv</code> does not make a syscall. Let us prove it ourselves.</p>
<span id="continue-reading"></span>
<p>Here is a small program that reads an environment variable and prints out the value.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> v = ::std::env::var("</span><span style="color:#a3be8c;">USER</span><span>").</span><span style="color:#96b5b4;">unwrap</span><span>();
</span><span> println!("</span><span style="color:#a3be8c;">USER: </span><span style="color:#d08770;">{}</span><span>", v);
</span><span>}
</span></code></pre>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cargo build
</span><span>$ ./target/debug/env-syscall-test
</span><span>USER: herman
</span></code></pre>
<p>Now we can use <code>dtruss</code> to print out all the syscalls our program makes.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ sudo dtruss ./target/debug/env-syscall-test
</span><span>dtrace: system integrity protection is on, some features will not be available
</span><span>
</span><span>SYSCALL(args) = return
</span><span>USER: root
</span><span>access("/AppleInternal/XBS/.isChrooted\0", 0x0, 0x0) = -1 2
</span><span>...elided
</span></code></pre>
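<p>The <code>dtruss</code> output shows that our program reads the variable and prints <code>USER: root</code> without making a syscall to fetch it. The value is <code>root</code> rather than <code>herman</code> because we ran the program under <code>sudo</code>.</p>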
http://activitystrea.ms/schema/1.0/postProfiling Node.js in Production2024-01-07T00:00:00+00:002024-01-07T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/profiling-node-in-production/<p>At work, I lead a team responsible for a Node.js service that serves a lot of GraphQL queries. We recently noticed some servers in the cluster were running much slower than others. We had used <a href="https://github.com/davidmarkclements/0x">0x</a> in the past to profile Node.js services locally. In this case, we could not identify the problem locally and needed a solution to profile Node.js in production to identify the cause of the slowdown.</p>
<span id="continue-reading"></span>
<p>To profile in production, I wanted to expose a route that would enable profiling for a short amount of time and then give us access to the results. Working with another engineer, we decided on the following requirements:</p>
<ul>
<li>the route would only be accessible through an internal port because our application was internet facing</li>
<li>the data would be returned as part of the route when the profiling was finished</li>
</ul>
<details>
<summary>Node and package versions used in this article</summary>
<ul>
<li>node - 18.12.0</li>
<li>express - 4.18.2</li>
<li>v8-profiler-next - 1.10.0</li>
</ul>
</details>
<h2 id="setup"><a class="zola-anchor" href="#setup" aria-label="Anchor link for: setup">Setup</a></h2>
<p>Let us create a simple express app that represents the Node.js service we want to profile.</p>
<pre data-lang="js" style="background-color:#2b303b;color:#c0c5ce;" class="language-js "><code class="language-js" data-lang="js"><span style="color:#b48ead;">import </span><span style="color:#bf616a;">express </span><span style="color:#b48ead;">from </span><span>"</span><span style="color:#a3be8c;">express</span><span>";
</span><span>
</span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">app </span><span>= </span><span style="color:#8fa1b3;">express</span><span>();
</span><span>
</span><span style="color:#bf616a;">app</span><span>.</span><span style="color:#96b5b4;">get</span><span>("</span><span style="color:#a3be8c;">/</span><span>", (</span><span style="color:#bf616a;">req</span><span>, </span><span style="color:#bf616a;">res</span><span>) </span><span style="color:#b48ead;">=> </span><span>{
</span><span> </span><span style="color:#bf616a;">res</span><span>.</span><span style="color:#96b5b4;">send</span><span>("</span><span style="color:#a3be8c;">hello world</span><span>");
</span><span>});
</span><span>
</span><span style="color:#bf616a;">app</span><span>.</span><span style="color:#8fa1b3;">listen</span><span>(</span><span style="color:#d08770;">8000</span><span>);
</span></code></pre>
<h2 id="internally-enable-profiling-at-runtime"><a class="zola-anchor" href="#internally-enable-profiling-at-runtime" aria-label="Anchor link for: internally-enable-profiling-at-runtime">Internally Enable Profiling at Runtime</a></h2>
<p>I use <code>node --prof /path/to/main.js</code> when profiling locally. More specifically, I use <a href="https://github.com/davidmarkclements/0x">0x</a>, which calls the application using the <code>--prof</code> flag and automatically generates a flamegraph. The problem is that we want to profile a running service in production for a few seconds.</p>
<p>We found two packages on npm that can enable profiling at runtime:</p>
<ul>
<li><a href="https://github.com/hyj1991/v8-profiler-next">hyj1991/v8-profiler-next</a></li>
<li><a href="https://github.com/node-inspector/v8-profiler">node-inspector/v8-profiler</a></li>
</ul>
<p>After reading <a href="https://github.com/node-inspector/v8-profiler/issues/137">https://github.com/node-inspector/v8-profiler/issues/137</a>, we chose <code>v8-profiler-next</code> because our Node.js service is running node v20.</p>
<p>Our production Node.js service is internet facing. We only want our new profiling route available on our internal VPN. Our example express app listens for requests on port <code>8000</code>, which we can pretend is our public port. We create a separate express app listening on port <code>8001</code>.</p>
<pre data-lang="js" style="background-color:#2b303b;color:#c0c5ce;" class="language-js "><code class="language-js" data-lang="js"><span style="color:#b48ead;">import </span><span style="color:#bf616a;">v8Profiler </span><span style="color:#b48ead;">from </span><span>"</span><span style="color:#a3be8c;">v8-profiler-next</span><span>";
</span><span>
</span><span style="color:#bf616a;">v8Profiler</span><span>.</span><span style="color:#8fa1b3;">setGenerateType</span><span>(</span><span style="color:#d08770;">1</span><span>);
</span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">mgmt </span><span>= </span><span style="color:#8fa1b3;">express</span><span>();
</span><span>
</span><span style="color:#bf616a;">mgmt</span><span>.</span><span style="color:#96b5b4;">get</span><span>("</span><span style="color:#a3be8c;">/profile</span><span>", (</span><span style="color:#bf616a;">req</span><span>, </span><span style="color:#bf616a;">res</span><span>) </span><span style="color:#b48ead;">=> </span><span>{
</span><span> </span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">timeoutMs </span><span>= </span><span style="color:#bf616a;">req</span><span>.</span><span style="color:#bf616a;">query</span><span>.</span><span style="color:#bf616a;">timeout </span><span>|| </span><span style="color:#d08770;">1000</span><span>;
</span><span>
</span><span> </span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">title </span><span>= `</span><span style="color:#a3be8c;">myapp-${</span><span style="color:#ebcb8b;">Date</span><span style="color:#a3be8c;">.</span><span style="color:#8fa1b3;">now</span><span style="color:#a3be8c;">()}.cpuprofile</span><span>`;
</span><span> </span><span style="color:#bf616a;">v8Profiler</span><span>.</span><span style="color:#8fa1b3;">startProfiling</span><span>(</span><span style="color:#bf616a;">title</span><span>);
</span><span> </span><span style="color:#96b5b4;">setTimeout</span><span>(() </span><span style="color:#b48ead;">=> </span><span>{
</span><span> </span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">profile </span><span>= </span><span style="color:#bf616a;">v8Profiler</span><span>.</span><span style="color:#8fa1b3;">stopProfiling</span><span>(</span><span style="color:#bf616a;">title</span><span>);
</span><span> </span><span style="color:#bf616a;">profile</span><span>.</span><span style="color:#8fa1b3;">export</span><span>(</span><span style="color:#b48ead;">function </span><span>(</span><span style="color:#bf616a;">error</span><span>, </span><span style="color:#bf616a;">result</span><span>) {
</span><span> </span><span style="color:#bf616a;">res</span><span>.</span><span style="color:#8fa1b3;">attachment</span><span>(</span><span style="color:#bf616a;">title</span><span>);
</span><span> </span><span style="color:#bf616a;">res</span><span>.</span><span style="color:#96b5b4;">send</span><span>(</span><span style="color:#bf616a;">result</span><span>);
</span><span> </span><span style="color:#bf616a;">profile</span><span>.</span><span style="color:#96b5b4;">delete</span><span>();
</span><span> });
</span><span> }, </span><span style="color:#bf616a;">timeoutMs</span><span>);
</span><span>});
</span><span>
</span><span style="color:#bf616a;">mgmt</span><span>.</span><span style="color:#8fa1b3;">listen</span><span>(</span><span style="color:#d08770;">8001</span><span>);
</span></code></pre>
<details>
<summary>Entire main.js file</summary>
<pre data-lang="js" style="background-color:#2b303b;color:#c0c5ce;" class="language-js "><code class="language-js" data-lang="js"><span style="color:#b48ead;">import </span><span style="color:#bf616a;">express </span><span style="color:#b48ead;">from </span><span>"</span><span style="color:#a3be8c;">express</span><span>";
</span><span style="color:#b48ead;">import </span><span style="color:#bf616a;">v8Profiler </span><span style="color:#b48ead;">from </span><span>"</span><span style="color:#a3be8c;">v8-profiler-next</span><span>";
</span><span>
</span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">app </span><span>= </span><span style="color:#8fa1b3;">express</span><span>();
</span><span>
</span><span style="color:#bf616a;">app</span><span>.</span><span style="color:#96b5b4;">get</span><span>("</span><span style="color:#a3be8c;">/</span><span>", (</span><span style="color:#bf616a;">req</span><span>, </span><span style="color:#bf616a;">res</span><span>) </span><span style="color:#b48ead;">=> </span><span>{
</span><span> </span><span style="color:#bf616a;">res</span><span>.</span><span style="color:#96b5b4;">send</span><span>("</span><span style="color:#a3be8c;">hello world</span><span>");
</span><span>});
</span><span>
</span><span style="color:#bf616a;">app</span><span>.</span><span style="color:#8fa1b3;">listen</span><span>(</span><span style="color:#d08770;">8000</span><span>);
</span><span>
</span><span style="color:#bf616a;">v8Profiler</span><span>.</span><span style="color:#8fa1b3;">setGenerateType</span><span>(</span><span style="color:#d08770;">1</span><span>);
</span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">mgmt </span><span>= </span><span style="color:#8fa1b3;">express</span><span>();
</span><span>
</span><span style="color:#bf616a;">mgmt</span><span>.</span><span style="color:#96b5b4;">get</span><span>("</span><span style="color:#a3be8c;">/profile</span><span>", (</span><span style="color:#bf616a;">req</span><span>, </span><span style="color:#bf616a;">res</span><span>) </span><span style="color:#b48ead;">=> </span><span>{
</span><span> </span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">timeoutMs </span><span>= </span><span style="color:#bf616a;">req</span><span>.</span><span style="color:#bf616a;">query</span><span>.</span><span style="color:#bf616a;">timeout </span><span>|| </span><span style="color:#d08770;">1000</span><span>;
</span><span>
</span><span> </span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">title </span><span>= `</span><span style="color:#a3be8c;">myapp-${</span><span style="color:#ebcb8b;">Date</span><span style="color:#a3be8c;">.</span><span style="color:#8fa1b3;">now</span><span style="color:#a3be8c;">()}.cpuprofile</span><span>`;
</span><span> </span><span style="color:#bf616a;">v8Profiler</span><span>.</span><span style="color:#8fa1b3;">startProfiling</span><span>(</span><span style="color:#bf616a;">title</span><span>);
</span><span> </span><span style="color:#96b5b4;">setTimeout</span><span>(() </span><span style="color:#b48ead;">=> </span><span>{
</span><span> </span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">profile </span><span>= </span><span style="color:#bf616a;">v8Profiler</span><span>.</span><span style="color:#8fa1b3;">stopProfiling</span><span>(</span><span style="color:#bf616a;">title</span><span>);
</span><span> </span><span style="color:#bf616a;">profile</span><span>.</span><span style="color:#8fa1b3;">export</span><span>(</span><span style="color:#b48ead;">function </span><span>(</span><span style="color:#bf616a;">error</span><span>, </span><span style="color:#bf616a;">result</span><span>) {
</span><span> </span><span style="color:#bf616a;">res</span><span>.</span><span style="color:#8fa1b3;">attachment</span><span>(</span><span style="color:#bf616a;">title</span><span>);
</span><span> </span><span style="color:#bf616a;">res</span><span>.</span><span style="color:#96b5b4;">send</span><span>(</span><span style="color:#bf616a;">result</span><span>);
</span><span> </span><span style="color:#bf616a;">profile</span><span>.</span><span style="color:#96b5b4;">delete</span><span>();
</span><span> });
</span><span> }, </span><span style="color:#bf616a;">timeoutMs</span><span>);
</span><span>});
</span><span>
</span><span style="color:#bf616a;">mgmt</span><span>.</span><span style="color:#8fa1b3;">listen</span><span>(</span><span style="color:#d08770;">8001</span><span>);
</span></code></pre>
</details>
<p>We set <code>v8Profiler.setGenerateType(1);</code> to use the new <em>tree</em> profiling format. Most modern tooling that analyzes CPU profiles prefers this format.</p>
<p>The <code>/profile</code> route will enable profiling and then return the output. We allow a <code>timeout</code> query parameter to control how long the profiling runs. We want to minimize the amount of time we profile for two reasons:</p>
<ul>
<li>profiling slows the service down</li>
<li>the output can get quite large</li>
</ul>
<p>Our production service receives a lot of traffic, so a few seconds of profile output was more than enough to start analyzing.</p>
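<p>Since the <code>timeout</code> value arrives as a query string, it may be worth validating and capping it before handing it to <code>setTimeout</code>. The following is a minimal sketch and not part of our actual service: the <code>parseTimeoutMs</code> helper and the 10 second ceiling are assumptions made for illustration.</p>

```javascript
// Hypothetical helper: turn the raw `timeout` query value into a safe number
// of milliseconds. The 1000 ms default matches the route above; the 10000 ms
// ceiling is an assumption chosen for illustration.
function parseTimeoutMs(raw, defaultMs = 1000, maxMs = 10000) {
  const parsed = Number.parseInt(raw, 10);
  // Missing, non-numeric, or non-positive values fall back to the default.
  if (!Number.isFinite(parsed) || parsed <= 0) {
    return defaultMs;
  }
  // Cap long requests so a typo cannot keep the profiler running for minutes.
  return Math.min(parsed, maxMs);
}

console.log(parseTimeoutMs(undefined)); // 1000
console.log(parseTimeoutMs("3000")); // 3000
console.log(parseTimeoutMs("abc")); // 1000
console.log(parseTimeoutMs("600000")); // 10000
```

<p>Inside the route, this could replace the <code>req.query.timeout || 1000</code> expression with <code>parseTimeoutMs(req.query.timeout)</code>.</p>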
<p>We give each profile a name so we do not get them confused. Our example app uses <code>myapp-${Date.now()}.cpuprofile</code>. You may also consider including the host name in the filename if available. The <code>.cpuprofile</code> extension is the convention for profiling output.</p>
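<p>Including the host name could look something like the sketch below. The <code>profileTitle</code> helper is hypothetical, and the host value is passed in as an argument; in a real service it would likely come from <code>os.hostname()</code> in Node's <code>node:os</code> module.</p>

```javascript
// Hypothetical helper: build a profile file name that includes the host.
// `host` is supplied by the caller, for example from os.hostname().
function profileTitle(appName, host, timestampMs = Date.now()) {
  // Replace characters that are awkward in file names with a dash.
  const safeHost = String(host).replace(/[^A-Za-z0-9.-]/g, "-");
  return `${appName}-${safeHost}-${timestampMs}.cpuprofile`;
}

console.log(profileTitle("myapp", "web-7f9c", 1704639385130));
// myapp-web-7f9c-1704639385130.cpuprofile
```

<p>With several servers in a cluster, this makes it easier to tell which download came from which machine.</p>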
<p>After the profile is complete, the profile output is sent back in the response. We use <code>res.attachment</code> to add the <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition">Content-Disposition</a> header to the response to indicate the content should be downloaded and saved locally.</p>
<h2 id="profiling-the-service"><a class="zola-anchor" href="#profiling-the-service" aria-label="Anchor link for: profiling-the-service">Profiling the Service</a></h2>
<p>We first start our Node.js application using <code>node main.js</code>.</p>
<p>We can call our application's public routes on port <code>8000</code>:</p>
<pre data-lang="shell" style="background-color:#2b303b;color:#c0c5ce;" class="language-shell "><code class="language-shell" data-lang="shell"><span>$ curl localhost:8000/
</span><span>hello world%
</span></code></pre>
<p>We can profile our application using port <code>8001</code>:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ curl localhost:8001/profile --remote-name --remote-header-name --silent
</span><span>$ ls myapp-1704639385130.cpuprofile
</span><span>myapp-1704639385130.cpuprofile
</span></code></pre>
<ul>
<li>The <code>--remote-name</code> option instructs <code>curl</code> to save that data into a local file instead of writing to <code>stdout</code>.</li>
<li>The <code>--remote-header-name</code> option instructs <code>curl</code> to use the <code>Content-Disposition</code> filename instead of extracting a filename from the URL.</li>
<li>The <code>--silent</code> option instructs <code>curl</code> to disable the progress meter. This is my personal preference and not required.</li>
</ul>
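<p>If you would rather download the profile from a script than from <code>curl</code>, the client needs to do what <code>--remote-header-name</code> does: pull the file name out of the <code>Content-Disposition</code> header. Here is a simplified sketch. The <code>filenameFromContentDisposition</code> helper is hypothetical, and it deliberately ignores the extended <code>filename*</code> form, so treat it as a starting point and not a complete parser.</p>

```javascript
// Hypothetical helper: extract the file name from a Content-Disposition
// header value such as: attachment; filename="myapp-123.cpuprofile"
// Simplification: the extended `filename*=` form is not handled.
function filenameFromContentDisposition(header) {
  if (typeof header !== "string") {
    return null;
  }
  // First alternative matches a quoted value, second an unquoted token.
  const match = /filename="([^"]*)"|filename=([^;\s]+)/i.exec(header);
  if (!match) {
    return null;
  }
  return match[1] !== undefined ? match[1] : match[2];
}

console.log(
  filenameFromContentDisposition('attachment; filename="myapp-1704639385130.cpuprofile"')
);
// myapp-1704639385130.cpuprofile
```

<p>A script could pass the response header to this function and then write the response body to the returned file name.</p>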
<h3 id="internal-port-in-production"><a class="zola-anchor" href="#internal-port-in-production" aria-label="Anchor link for: internal-port-in-production">Internal Port in Production</a></h3>
<p>In order to profile in production, we need to:</p>
<ul>
<li>identify the server we want to profile</li>
<li>get network access to send a request to that server</li>
</ul>
<p>In our case, our Node.js service emits telemetry that is aggregated into dashboards. Using these dashboards, we were able to identify which servers were using more CPU than the others.</p>
<p>Once we identified the server, we were able to VPN into the production network and use some <a href="https://kubernetes.io/docs/reference/kubectl/">kubectl</a> commands to send requests to a specific server on a specific port.</p>
<h2 id="analyzing-the-cpu-profile-using-flamegraphs"><a class="zola-anchor" href="#analyzing-the-cpu-profile-using-flamegraphs" aria-label="Anchor link for: analyzing-the-cpu-profile-using-flamegraphs">Analyzing the CPU Profile Using Flamegraphs</a></h2>
<p>We spent some time searching for the best way to view the results as a <a href="https://www.brendangregg.com/flamegraphs.html">flamegraph</a>. I was used to <a href="https://github.com/davidmarkclements/0x">0x</a> handling this by default. I liked the output of <a href="https://github.com/davidmarkclements/0x">0x</a> but I did not want to hack the code to render the results for an external file.</p>
<p>A lot of the suggestions I read said to use Chrome's <code>Profile</code> tab. Newer versions of Chrome no longer have this tab and I could not get the <code>Performance Insights</code> tab to render my <code>.cpuprofile</code> files. Other suggestions were to manually convert the file into an SVG image. An SVG file is fine, but I wanted something a little better.</p>
<p>I tried <a href="https://github.com/thlorenz/flamegraph">thlorenz/flamegraph</a> but I received an error when I tried to use it and gave up after a few minutes.</p>
<p>I happened to stumble upon <a href="https://github.com/jlfwong/speedscope">jlfwong/speedscope</a>, which was exactly what I was looking for. It is easy to install and use internally. If you are working on open source you can use <a href="https://www.speedscope.app/">https://www.speedscope.app/</a>.</p>
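<p>For a quick sanity check before opening a viewer, a <code>.cpuprofile</code> file is just JSON and can be summarized with a few lines of code. The sketch below assumes the file follows the Chrome DevTools style shape, where each entry in <code>nodes</code> has a <code>callFrame</code> and a <code>hitCount</code>. The <code>topSelfHits</code> function and the sample data are made up for illustration and are not output from our service.</p>

```javascript
// Count samples per function from a parsed .cpuprofile object.
// Assumes the DevTools-style shape: { nodes: [{ callFrame, hitCount }] }.
function topSelfHits(profile, limit = 5) {
  const hitsByFunction = new Map();
  for (const node of profile.nodes ?? []) {
    const { functionName, url, lineNumber } = node.callFrame;
    // Anonymous functions have an empty functionName.
    const key = `${functionName || "(anonymous)"} ${url}:${lineNumber}`;
    // The same function can appear under several call paths, so sum them.
    hitsByFunction.set(key, (hitsByFunction.get(key) ?? 0) + (node.hitCount ?? 0));
  }
  // Sort descending by hit count and keep the top entries.
  return [...hitsByFunction.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([name, hits]) => ({ name, hits }));
}

// Tiny hand-written profile for illustration only.
const sampleProfile = {
  nodes: [
    { id: 1, callFrame: { functionName: "(root)", url: "", lineNumber: -1 }, hitCount: 0, children: [2, 3] },
    { id: 2, callFrame: { functionName: "publish", url: "file:///app/pubsub.js", lineNumber: 10 }, hitCount: 7 },
    { id: 3, callFrame: { functionName: "resolve", url: "file:///app/schema.js", lineNumber: 42 }, hitCount: 2, children: [4] },
    { id: 4, callFrame: { functionName: "publish", url: "file:///app/pubsub.js", lineNumber: 10 }, hitCount: 3 },
  ],
};

console.log(topSelfHits(sampleProfile, 2));
```

<p>To run this against a real download, the profile object could be loaded with <code>JSON.parse</code> on the file contents. Hit counts only show where samples landed directly, so a flamegraph is still the better tool for seeing the full call stacks.</p>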
<h2 id="conclusion"><a class="zola-anchor" href="#conclusion" aria-label="Anchor link for: conclusion">Conclusion</a></h2>
<p>Once we were able to view the <code>.cpuprofile</code> flamegraphs, we quickly identified that our <a href="https://www.apollographql.com/docs/apollo-server/data/subscriptions">Apollo Server - Subscription</a> implementation was the culprit. We were using <a href="https://github.com/davidyaha/graphql-redis-subscriptions">davidyaha/graphql-redis-subscriptions</a> to load balance subscriptions across the cluster and it was using a lot of CPU, possibly due to the use of <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/AsyncGenerator">async generators</a>, which have had known performance issues. We are using node v20 in production and supposedly those performance issues are fixed.</p>
<p>The issue was that when a particular server handled too many subscriptions, it slowed way down. We are still investigating and changed implementations in the meantime. The cluster is now performing much better.</p>
<h2 id="addendum"><a class="zola-anchor" href="#addendum" aria-label="Anchor link for: addendum">Addendum</a></h2>
<h3 id="startprofiling-options"><a class="zola-anchor" href="#startprofiling-options" aria-label="Anchor link for: startprofiling-options">startProfiling Options</a></h3>
<p>I noticed the <code>startProfiling</code> function accepts three parameters:</p>
<pre data-lang="typescript" style="background-color:#2b303b;color:#c0c5ce;" class="language-typescript "><code class="language-typescript" data-lang="typescript"><span style="color:#b48ead;">export function </span><span style="color:#8fa1b3;">startProfiling</span><span>(</span><span style="color:#bf616a;">name</span><span>?: string, </span><span style="color:#bf616a;">recsamples</span><span>?: boolean, </span><span style="color:#bf616a;">mode</span><span>?: </span><span style="color:#d08770;">0 </span><span>| </span><span style="color:#d08770;">1</span><span>): void;
</span></code></pre>
<p>I could not find any documentation on what the <code>recsamples</code> and <code>mode</code> options did though. The default value for <code>recsamples</code> is <code>true</code> and for <code>mode</code> is 0.</p>
<p>I dug through the code and eventually found the answer to <code>recsamples</code> in <a href="https://github.com/v8/v8/blob/10.1.10/include/v8-profiler.h#L387-L388">v8-profiler.h</a> which says</p>
<blockquote>
<p>|record_samples| parameter controls whether individual samples should be recorded in addition to the aggregated tree</p>
</blockquote>
<p>I also found the answer to <code>mode</code> in <a href="https://github.com/hyj1991/v8-profiler-next/blob/ba0b6b9c46b6469466da5e995b7cb4099de1a5c1/src/cpu_profiler/cpu_profiler.cc#L68-L75">cpu_profiler.cc</a>, which shows that it toggles between eager and lazy logging.</p>
http://activitystrea.ms/schema/1.0/postStream a Body With Trailers in axum 0.62023-10-19T00:00:00+00:002023-10-19T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/streaming-body-trailing-headers-axum-0-6/<p>Hyper is designed to support streaming bodies. The current version of axum, v0.6, supports streaming a response. If we want to include <a href="https://datatracker.ietf.org/doc/html/rfc7230#section-4.4">trailers</a> (sometimes called "trailing headers") then we need to implement our own custom body.</p>
<span id="continue-reading"></span>
<p>Caveats:</p>
<ul>
<li>The custom body implementation only works in axum 0.6, which uses http-body 0.4.4. The http-body crate changed in v1.0.0-rc.2. The concept is the same, but the custom <code>StreamBody</code> type will be different.</li>
<li>Trailers are only supported in hyper using HTTP/2. You can monitor https://github.com/hyperium/hyper/issues/2719 for HTTP/1.1 support.</li>
</ul>
<h3 id="set-up"><a class="zola-anchor" href="#set-up" aria-label="Anchor link for: set-up">Set up</a></h3>
<p>In order to send trailers, we need an axum server that uses HTTP/2. Also, most implementations of HTTP/2 require TLS. Let us start from <a href="https://github.com/tokio-rs/axum/tree/1e5be5bb693f825ece664518f3aa6794f03bfec6/examples/tls-rustls">axum/examples/tls-rustls</a>. This will give us a working HTTP/2 server that uses self-signed TLS certificates.</p>
<p>We need to make a few changes to the <code>Cargo.toml</code> in order for the example to work:</p>
<pre data-lang="diff" style="background-color:#2b303b;color:#c0c5ce;" class="language-diff "><code class="language-diff" data-lang="diff"><span> [package]
</span><span style="color:#bf616a;">-name = "example-tls-rustls"
</span><span style="color:#a3be8c;">+name = "axum-trailers"
</span><span> version = "0.1.0"
</span><span> edition = "2021"
</span><span> publish = false
</span><span>
</span><span> [dependencies]
</span><span style="color:#bf616a;">-axum = { path = "../../axum" }
</span><span style="color:#a3be8c;">+axum = { version = "0.6.20", features = ["http2"] }
</span><span> axum-server = { version = "0.3", features = ["tls-rustls"] }
</span></code></pre>
<p>We can now verify that our server is working:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cargo run
</span><span> Compiling axum-trailers v0.1.0 (/Users/herman/Code/axum-trailers)
</span><span> Finished dev [unoptimized + debuginfo] target(s) in 3.65s
</span><span> Running `target/debug/axum-trailers`
</span></code></pre>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ curl -k https://localhost:3000
</span><span>Hello, World!%
</span></code></pre>
<h3 id="streaming-body"><a class="zola-anchor" href="#streaming-body" aria-label="Anchor link for: streaming-body">Streaming Body</a></h3>
<p>Before sending trailers, we need to change our <code>handler</code> function to stream a response. First, add <code>tokio-stream</code> as a dependency:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cargo add tokio-stream
</span></code></pre>
<p>We then need to modify our imports:</p>
<pre data-lang="diff" style="background-color:#2b303b;color:#c0c5ce;" class="language-diff "><code class="language-diff" data-lang="diff"><span> use axum::{
</span><span style="color:#a3be8c;">+ body::StreamBody,
</span><span> extract::Host,
</span><span> handler::HandlerWithoutStateExt,
</span><span> http::{StatusCode, Uri},
</span><span style="color:#a3be8c;">+ response::IntoResponse,
</span><span> response::Redirect,
</span><span style="color:#a3be8c;">+ response::Response,
</span><span> routing::get,
</span><span> BoxError, Router,
</span><span> };
</span><span> use axum_server::tls_rustls::RustlsConfig;
</span><span style="color:#bf616a;">-use std::{net::SocketAddr, path::PathBuf};
</span><span style="color:#a3be8c;">+use std::{convert::Infallible, net::SocketAddr, path::PathBuf};
</span><span style="color:#a3be8c;">+use tokio::sync::mpsc;
</span><span> use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};
</span></code></pre>
<p>Finally, we can replace the existing handler with one that streams a body:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>async </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">handler</span><span>() -> impl IntoResponse {
</span><span> </span><span style="color:#b48ead;">let </span><span>(tx, rx) = mpsc::channel::<Result<String, Infallible>>(</span><span style="color:#d08770;">2</span><span>);
</span><span>
</span><span> tokio::spawn(async </span><span style="color:#b48ead;">move </span><span>{
</span><span> tx.</span><span style="color:#96b5b4;">send</span><span>(Ok("</span><span style="color:#a3be8c;">hello...</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>())).await.</span><span style="color:#96b5b4;">unwrap</span><span>();
</span><span> tokio::time::sleep(std::time::Duration::from_secs(</span><span style="color:#d08770;">2</span><span>)).await;
</span><span> tx.</span><span style="color:#96b5b4;">send</span><span>(Ok("</span><span style="color:#a3be8c;">world</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>())).await.</span><span style="color:#96b5b4;">unwrap</span><span>();
</span><span> });
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> stream = tokio_stream::wrappers::ReceiverStream::new(rx);
</span><span> </span><span style="color:#b48ead;">let</span><span> body = StreamBody::new(stream);
</span><span>
</span><span> Response::builder()
</span><span> .</span><span style="color:#96b5b4;">status</span><span>(StatusCode::</span><span style="color:#d08770;">OK</span><span>)
</span><span> .</span><span style="color:#96b5b4;">body</span><span>(body)
</span><span> .</span><span style="color:#96b5b4;">unwrap</span><span>()
</span><span>}
</span></code></pre>
<p>We spawn a <em>task</em> that will send <code>hello...</code>, wait 2 seconds and then send <code>world</code>. Hyper knows how to correctly process a stream, but does not know what to do with the <em>receiver</em> from the <code>mpsc::channel</code>. We use <code>tokio-stream</code> to convert the receiver into a stream and use that as our response body.</p>
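<p>The producer/receiver shape can be sketched with only the standard library. This is a simplified model, not the handler itself: it assumes a thread in place of a tokio task, <code>std::sync::mpsc::sync_channel</code> in place of tokio's async channel, and a hypothetical <code>spawn_producer</code> helper. Iterating the receiver plays the role that <code>ReceiverStream</code> plays for hyper.</p>

```rust
use std::io::Write;
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: spawns a producer that sends two chunks with a pause
/// between them and returns the receiving half of the channel.
fn spawn_producer(pause: Duration) -> mpsc::Receiver<String> {
    // Bounded channel with capacity 2, mirroring `mpsc::channel(2)` in the handler.
    let (tx, rx) = mpsc::sync_channel::<String>(2);

    thread::spawn(move || {
        tx.send("hello...".to_string()).unwrap();
        thread::sleep(pause);
        tx.send("world".to_string()).unwrap();
        // `tx` is dropped here, which closes the channel and ends the "stream".
    });

    rx
}

fn main() {
    let mut out = std::io::stdout();
    // Iterating the receiver yields each chunk as it arrives and stops once
    // the sender has been dropped.
    for chunk in spawn_producer(Duration::from_secs(2)) {
        write!(out, "{chunk}").unwrap();
        // Flush so the pause between the two chunks is visible.
        out.flush().unwrap();
    }
    writeln!(out).unwrap();
}
```

<p>The consumer never needs to know how many chunks are coming. It keeps pulling until the sender goes away, which is the same contract a streaming response body relies on.</p>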
<p>Note: HTTP/2 does not use a <code>Transfer-Encoding</code> header. You can add one, but hyper will properly strip it out.</p>
<p>We can test that our response is now streaming a body using curl.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ curl -k --no-buffer https://localhost:3000/
</span><span>hello...world%
</span></code></pre>
<p>With the <code>--no-buffer</code> flag, you should notice a pause between <code>hello...</code> and <code>world</code>.</p>
<h3 id="sending-trailers"><a class="zola-anchor" href="#sending-trailers" aria-label="Anchor link for: sending-trailers">Sending Trailers</a></h3>
<p>In http-body v0.4.4, the <a href="https://github.com/hyperium/http-body/blob/a97da649b6dc93660931fc6f0bdb6aa2db64e50d/src/lib.rs#L56-L62">Body</a> trait has a <code>poll_trailers</code> method that handles the sending of trailers at the end of the body. In axum v0.6, the <code>poll_trailers</code> implementation for <a href="https://github.com/tokio-rs/axum/blob/1e5be5bb693f825ece664518f3aa6794f03bfec6/axum/src/body/stream_body.rs">StreamBody</a> always returns <code>None</code>:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">poll_trailers</span><span>(
</span><span> </span><span style="color:#bf616a;">self</span><span>: Pin<&</span><span style="color:#b48ead;">mut Self</span><span>>,
</span><span> </span><span style="color:#bf616a;">_cx</span><span>: &</span><span style="color:#b48ead;">mut </span><span>Context<'_>,
</span><span>) -> Poll<Result<Option<HeaderMap>, </span><span style="color:#b48ead;">Self::</span><span>Error>> {
</span><span> Poll::Ready(Ok(None))
</span><span>}
</span></code></pre>
<h4 id="custom-streambody"><a class="zola-anchor" href="#custom-streambody" aria-label="Anchor link for: custom-streambody">Custom <code>StreamBody</code></a></h4>
<p>We can start from axum's <code>StreamBody</code> implementation and add support for trailers.</p>
<p>Copy the <a href="https://github.com/tokio-rs/axum/blob/1e5be5bb693f825ece664518f3aa6794f03bfec6/axum/src/body/stream_body.rs">StreamBody</a> implementation from axum to our server:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>curl --silent "https://raw.githubusercontent.com/tokio-rs/axum/1e5be5bb693f825ece664518f3aa6794f03bfec6/axum/src/body/stream_body.rs" --output src/stream_body.rs
</span></code></pre>
<p>We need to make some changes to the import statements in <code>src/stream_body.rs</code>:</p>
<ol>
<li>Rename <code>crate</code> to <code>axum</code></li>
<li>Remove <code>use http::HeaderMap</code> as axum re-exports this dependency</li>
<li>Add <code>http::HeaderMap</code> to the existing <code>use axum::{ ... }</code> import.</li>
</ol>
<pre data-lang="diff" style="background-color:#2b303b;color:#c0c5ce;" class="language-diff "><code class="language-diff" data-lang="diff"><span style="color:#bf616a;">-use crate::{
</span><span style="color:#a3be8c;">+use axum::{
</span><span> body::{self, Bytes, HttpBody},
</span><span style="color:#a3be8c;">+ http::HeaderMap,
</span><span> response::{IntoResponse, Response},
</span><span> BoxError, Error,
</span><span> };
</span><span> use futures_util::{
</span><span> ready,
</span><span> stream::{self, TryStream},
</span><span> };
</span><span style="color:#bf616a;">-use http::HeaderMap;
</span><span> use pin_project_lite::pin_project;
</span><span> use std::{
</span><span> fmt,
</span></code></pre>
<p>We then modify the <code>StreamBody</code> struct to include <code>trailers</code>. This will allow us to store the trailers in our response.</p>
<pre data-lang="diff" style="background-color:#2b303b;color:#c0c5ce;" class="language-diff "><code class="language-diff" data-lang="diff"><span> pub struct StreamBody<S> {
</span><span> #[pin]
</span><span> stream: SyncWrapper<S>,
</span><span style="color:#a3be8c;">+ trailers: Option<HeaderMap>,
</span><span> }
</span><span> }
</span></code></pre>
<p>We also need to set <code>trailers</code> to <code>None</code> when creating a new stream:</p>
<pre data-lang="diff" style="background-color:#2b303b;color:#c0c5ce;" class="language-diff "><code class="language-diff" data-lang="diff"><span> pub fn new(stream: S) -> Self
</span><span> where
</span><span> S: TryStream + Send + 'static,
</span><span> S::Ok: Into<Bytes>,
</span><span> S::Error: Into<BoxError>,
</span><span> {
</span><span> Self {
</span><span> stream: SyncWrapper::new(stream),
</span><span style="color:#a3be8c;">+ trailers: None,
</span><span> }
</span><span> }
</span><span> }
</span><span>
</span><span> impl<S> IntoResponse for StreamBody<S>
</span></code></pre>
<p>Add a <code>set_trailers</code> method to <code>StreamBody</code> so we can attach trailer headers to our response:</p>
<pre data-lang="diff" style="background-color:#2b303b;color:#c0c5ce;" class="language-diff "><code class="language-diff" data-lang="diff"><span style="color:#a3be8c;">+ pub fn set_trailers(&mut self, headers: HeaderMap) {
</span><span style="color:#a3be8c;">+ self.trailers = Some(headers);
</span><span style="color:#a3be8c;">+ }
</span></code></pre>
<p>Finally, modify <code>poll_trailers</code> to send any trailers we set:</p>
<pre data-lang="diff" style="background-color:#2b303b;color:#c0c5ce;" class="language-diff "><code class="language-diff" data-lang="diff"><span> fn poll_trailers(
</span><span> self: Pin<&mut Self>,
</span><span> _cx: &mut Context<'_>,
</span><span> ) -> Poll<Result<Option<HeaderMap>, Self::Error>> {
</span><span style="color:#bf616a;">- Poll::Ready(Ok(None))
</span><span style="color:#a3be8c;">+ Poll::Ready(Ok(self.project().trailers.take()))
</span><span> }
</span></code></pre>
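<p>The key detail in that change is <code>Option::take</code>: it moves the trailers out and leaves <code>None</code> behind, so they are handed over at most once. A minimal std-only sketch of that behaviour follows. It assumes a <code>HashMap</code> as a stand-in for <code>HeaderMap</code> and a simplified <code>poll_trailers</code> without pinning, a task context or an error type; the <code>Body</code> struct is hypothetical.</p>

```rust
use std::collections::HashMap;
use std::task::Poll;

/// Stand-in for `http::HeaderMap`; an assumption to keep the sketch std-only.
type Trailers = HashMap<String, String>;

/// Hypothetical body type that only models the trailer storage.
struct Body {
    trailers: Option<Trailers>,
}

impl Body {
    fn set_trailers(&mut self, trailers: Trailers) {
        self.trailers = Some(trailers);
    }

    /// Simplified `poll_trailers`: no pinning, no context, no error type.
    fn poll_trailers(&mut self) -> Poll<Option<Trailers>> {
        // `take` moves the value out and leaves `None` in its place.
        Poll::Ready(self.trailers.take())
    }
}

fn main() {
    let mut body = Body { trailers: None };

    let mut trailers = Trailers::new();
    trailers.insert("chunky-trailer".to_string(), "foo".to_string());
    body.set_trailers(trailers);

    // The first poll hands the trailers over; the second finds nothing left.
    println!("{:?}", body.poll_trailers());
    println!("{:?}", body.poll_trailers());
}
```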
<h4 id="update-response"><a class="zola-anchor" href="#update-response" aria-label="Anchor link for: update-response">Update Response</a></h4>
<p>Now that we have a <code>StreamBody</code> implementation that will send trailers, we can update <code>handler</code> in <code>src/main.rs</code> to include them.</p>
<p>Update the imports to use the <code>StreamBody</code> we just created:</p>
<pre data-lang="diff" style="background-color:#2b303b;color:#c0c5ce;" class="language-diff "><code class="language-diff" data-lang="diff"><span style="color:#a3be8c;">+mod stream_body;
</span><span style="color:#a3be8c;">+
</span><span> use axum::{
</span><span style="color:#bf616a;">- body::StreamBody,
</span><span> extract::Host,
</span><span> handler::HandlerWithoutStateExt,
</span><span> http::{StatusCode, Uri},
</span><span>@@ -20,6 +21,8 @@ </span><span style="color:#8fa1b3;">use std::{convert::Infallible, net::SocketAddr, path::PathBuf};
</span><span> use tokio::sync::mpsc;
</span><span> use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};
</span><span>
</span><span style="color:#a3be8c;">+use crate::stream_body::StreamBody;
</span></code></pre>
<p>We modify our response to include a trailer header:</p>
<pre data-lang="diff" style="background-color:#2b303b;color:#c0c5ce;" class="language-diff "><code class="language-diff" data-lang="diff"><span> let stream = tokio_stream::wrappers::ReceiverStream::new(rx);
</span><span style="color:#bf616a;">- let body = StreamBody::new(stream);
</span><span style="color:#a3be8c;">+ let mut body = StreamBody::new(stream);
</span><span style="color:#a3be8c;">+ let mut headers = axum::http::HeaderMap::new();
</span><span style="color:#a3be8c;">+ headers.insert("chunky-trailer", "foo".parse().unwrap());
</span><span style="color:#a3be8c;">+
</span><span style="color:#a3be8c;">+ body.set_trailers(headers);
</span><span>
</span><span> Response::builder()
</span><span> .status(StatusCode::OK)
</span><span style="color:#a3be8c;">+ .header("Trailers", "chunky-trailer")
</span><span> .body(body)
</span><span> .unwrap()
</span></code></pre>
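<p>Note: we include a <code>Trailers</code> header that names the trailer headers we want to send. The field that RFC 9110 defines for announcing trailers is spelled <code>Trailer</code> (singular), so consider using that name if a client does not pick up the trailers.</p>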
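<p>With more than one trailer, the announcing header lists every name, separated by commas. A small sketch of deriving that value from the trailers themselves, so the two cannot drift apart: <code>trailer_names</code> is a hypothetical helper and <code>x-checksum</code> is a made-up second trailer used only for illustration.</p>

```rust
/// Hypothetical helper: joins trailer field names into the comma-separated
/// list used as the value of the announcing header.
fn trailer_names(trailers: &[(&str, &str)]) -> String {
    trailers
        .iter()
        .map(|(name, _)| *name)
        .collect::<Vec<_>>()
        .join(", ")
}

fn main() {
    // `x-checksum` is an invented example trailer, not part of the server above.
    let trailers = [("chunky-trailer", "foo"), ("x-checksum", "abc123")];
    println!("{}", trailer_names(&trailers));
}
```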
<p>We can use curl to verify that our trailer header is sent. Note that we must include the verbose flag, <code>-v</code>, in order to see the headers.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ curl -v -k --no-buffer https://localhost:3000/
</span><span>...snip
</span><span>> GET / HTTP/2
</span><span>> Host: localhost:3000
</span><span>> user-agent: curl/7.79.1
</span><span>> accept: */*
</span><span>>
</span><span>
</span><span>< HTTP/2 200
</span><span>< trailers: chunky-trailer
</span><span>< date: Thu, 19 Oct 2023 22:28:06 GMT
</span><span><
</span><span>hello...world< chunky-trailer: foo
</span><span>* Connection #0 to host localhost left intact
</span></code></pre>
<p>Note: the <code>< chunky-trailer: foo</code> is on the same line as <code>hello...world</code> because we did not buffer the body.</p>
<p>You can find the complete source code at <a href="https://github.com/hjr3/axum-trailers">https://github.com/hjr3/axum-trailers</a></p>
<p><a href="https://hermanradtke.com/the-essence-of-cqrs/">The Essence of CQRS</a>, by <a href="https://hermanradtke.com/about/">Herman J. Radtke III</a>, 2023-06-12</p>
<p>I have been familiar with the concept of Command Query Responsibility Segregation (CQRS) for a while, but did not truly understand its practical application until I read <a href="https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying">The Log: What every software engineer should know about real-time data's unifying abstraction</a> by Jay Kreps.</p>
<span id="continue-reading"></span>
<p>When I read articles explaining CQRS, most are like the one on <a href="https://martinfowler.com/bliki/CQRS.html">Martin Fowler's Bliki</a>. The concept of separating reads and writes makes sense, but the <em>why</em> is lost in a bunch of hand-wavy examples mostly relating to event sourcing. Then I saw this diagram from <a href="https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying">Jay Kreps's article on logs</a>:</p>
<center>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="121px" height="271px" viewBox="-0.5 -0.5 121 271"><defs/><g><path d="M 90 60 L 102 60 L 102 203.63" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 102 208.88 L 98.5 201.88 L 102 203.63 L 105.5 201.88 Z" fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 81px; margin-left: 101px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">writes</div></div></div></foreignObject><text x="101" y="84" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" text-anchor="middle">writes</text></switch></g><path d="M 30 60 L 29.97 93.63" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 29.97 98.88 L 26.48 91.88 L 29.97 93.63 L 33.48 91.88 Z" fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; 
justify-content: unsafe center; width: 1px; height: 1px; padding-top: 80px; margin-left: 28px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">reads</div></div></div></foreignObject><text x="28" y="84" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" text-anchor="middle">reads</text></switch></g><rect x="0" y="0" width="120" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 118px; height: 1px; padding-top: 30px; margin-left: 1px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Clients</div></div></div></foreignObject><text x="60" y="34" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="12px" text-anchor="middle">Clients</text></switch></g><rect x="0" y="100" width="90" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div 
xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 88px; height: 1px; padding-top: 130px; margin-left: 1px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Serving Nodes</div></div></div></foreignObject><text x="45" y="134" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="12px" text-anchor="middle">Serving Nodes</text></switch></g><path d="M 30 210 L 29.97 166.37" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 29.97 161.12 L 33.47 168.12 L 29.97 166.37 L 26.47 168.12 Z" fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 187px; margin-left: 28px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">writes</div></div></div></foreignObject><text x="28" y="190" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" text-anchor="middle">writes</text></switch></g><rect x="0" y="210" width="120" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" 
pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 118px; height: 1px; padding-top: 240px; margin-left: 1px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Log</div></div></div></foreignObject><text x="60" y="244" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="12px" text-anchor="middle">Log</text></switch></g></g><switch><g requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"/><a transform="translate(0,-5)" xlink:href="https://www.diagrams.net/doc/faq/svg-export-text-problems" target="_blank"><text text-anchor="middle" font-size="10px" x="50%" y="100%">Viewer does not support full SVG 1.1</text></a></switch></svg>
</center>
<p>I now understand the <em>essence</em> of what CQRS brings to distributed systems. We can build a service to send writes to something like Kafka and reads to something like Postgres (or any other data store).</p>
<center>
<svg xmlns="http://www.w3.org/2000/svg" style="background-color: rgb(255, 255, 255);" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="121px" height="271px" viewBox="-0.5 -0.5 121 271" content="<mxfile host="app.diagrams.net" modified="2023-06-12T14:07:15.367Z" agent="5.0 (Macintosh)" etag="G50WnQQX-FjSpCM_zbRX" version="15.8.6"><diagram id="rqCtcK7Q9IwAA8DcKYVc" name="Page-1">3VhNc9owEP01HJOxLRzMka8mM6UdOhzaHBV7sdXalisLbOfXV0YStmqgJSmF5MJ4n3bX0r59i6CHJkl5z3AWfaIBxD3HCsoemvYcZ9i/E581UEngznYlEDISSMhugCV5BgVaCl2TAHLDkVMac5KZoE/TFHxuYJgxWphuKxqbb81wCB1g6eO4i34lAY8k6rlWgz8ACSP9ZttSKwnWzgrIIxzQogWhWQ9NGKVcPiXlBOK6drouMu7DgdXdxhik/G8CRl/yZfGzvP/szzfYW+QPY3d801d745U+MATi/MqkjEc0pCmOZw06ZnSdBlBntYTV+MwpzQRoC/A7cF4pMvGaUwFFPInVKpSEf6vDbweuMh9bS9NSpd4alTZSzioZ5bnafmwvNnFbywhcACMJcGAKlMeuz3qwmgrK6Zr5cKSEuisxC4Ef8UM7zoVWgIrdsErEMYgxJxtzH1h1bbjzU6EjxnDVcsgoSXneyryoAeGg9NcfOjKjkp/jGU0iHmRGbbW21kDbRjqhqYbylRscr9WxCkY45Ht7bY6fxMQw+gPHJEzFsy+4qBkbb4BxIjQ5UgsJCQLZipCTZ/y0zVfTqsohkrvjnjvdEV0ngHLfwFDBjUzbLXBYMl0eVfYb69ayHWQWXVqnMd2hUmu10qZnZqCrVS4a0GT3n/DpXnZIOC8aEgihtzglnFdOCYP7U4n2OsJlgIN3olv3qG6FbC3LNhX2St2eX5h2h68lsA0Rjfg7Y6Yci0iM42WGtx1biHubyePB2nca/2A50dAyR6CmqWguUbbGotYF6s46U287nVotaM5Dtudr6dLFQnuKNfyftRpc0cC3Xjjw7QsPfPQWBr7+lfVOr2qDP17VzDvV1Q981KHrI179wFc3wfrofONemM2PZlnY5p8HNPsF</diagram></mxfile>"><defs/><g><path d="M 90 60 L 102 60 L 102 203.63" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 102 208.88 L 98.5 201.88 L 102 203.63 L 105.5 201.88 Z" fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; 
padding-top: 81px; margin-left: 101px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">writes</div></div></div></foreignObject><text x="101" y="84" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" text-anchor="middle">writes</text></switch></g><path d="M 30 60 L 29.97 93.63" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 29.97 98.88 L 26.48 91.88 L 29.97 93.63 L 33.48 91.88 Z" fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 80px; margin-left: 28px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">reads</div></div></div></foreignObject><text x="28" y="84" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" text-anchor="middle">reads</text></switch></g><rect x="0" y="0" width="120" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; 
text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 118px; height: 1px; padding-top: 30px; margin-left: 1px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Service</div></div></div></foreignObject><text x="60" y="34" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="12px" text-anchor="middle">Service</text></switch></g><rect x="0" y="100" width="90" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 88px; height: 1px; padding-top: 130px; margin-left: 1px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Postgres</div></div></div></foreignObject><text x="45" y="134" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="12px" text-anchor="middle">Postgres</text></switch></g><path d="M 30 210 L 29.97 166.37" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 29.97 161.12 L 33.47 168.12 L 29.97 166.37 L 26.47 168.12 
Z" fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 187px; margin-left: 28px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">writes</div></div></div></foreignObject><text x="28" y="190" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" text-anchor="middle">writes</text></switch></g><rect x="0" y="210" width="120" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 118px; height: 1px; padding-top: 240px; margin-left: 1px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Kafka</div></div></div></foreignObject><text x="60" y="244" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="12px" 
text-anchor="middle">Kafka</text></switch></g></g><switch><g requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"/><a transform="translate(0,-5)" xlink:href="https://www.diagrams.net/doc/faq/svg-export-text-problems" target="_blank"><text text-anchor="middle" font-size="10px" x="50%" y="100%">Viewer does not support full SVG 1.1</text></a></switch></svg>
</center>
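<p>The shape in the diagram can be sketched in memory. This is a toy model under stated assumptions, not a real integration: a <code>Vec</code> stands in for something like Kafka, a <code>HashMap</code> stands in for something like Postgres, and the <code>Command</code> variants are hypothetical. The service appends commands to the log, and the read model applies whatever it has not seen yet.</p>

```rust
use std::collections::HashMap;

/// A command recorded in the log. The variants are hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Command {
    SetName { id: u32, name: String },
    Delete { id: u32 },
}

/// Append-only log, standing in for something like Kafka.
#[derive(Default)]
struct Log {
    entries: Vec<Command>,
}

impl Log {
    fn append(&mut self, cmd: Command) {
        self.entries.push(cmd);
    }
}

/// Read model, standing in for something like Postgres.
#[derive(Default)]
struct ReadModel {
    names: HashMap<u32, String>,
    offset: usize, // index of the next log entry to apply
}

impl ReadModel {
    /// Applies every log entry this model has not seen yet.
    fn catch_up(&mut self, log: &Log) {
        for cmd in &log.entries[self.offset..] {
            match cmd {
                Command::SetName { id, name } => {
                    self.names.insert(*id, name.clone());
                }
                Command::Delete { id } => {
                    self.names.remove(id);
                }
            }
        }
        self.offset = log.entries.len();
    }

    fn get(&self, id: u32) -> Option<&str> {
        self.names.get(&id).map(String::as_str)
    }
}

fn main() {
    let mut log = Log::default();
    let mut model = ReadModel::default();

    // Write path: the service only appends to the log.
    log.append(Command::SetName { id: 1, name: "ada".to_string() });

    // Read path: queries see the write only after the model has caught up.
    println!("before catch_up: {:?}", model.get(1));
    model.catch_up(&log);
    println!("after catch_up: {:?}", model.get(1));
}
```

<p>Because the read model tracks its own offset, it can fall behind and catch up later without the write path knowing about it.</p>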
<p>The value of this approach is not for the service itself (though it can be helpful), but for the rest of the system. Separating commands allows other systems (like Search, Analytics, Monitoring, etc.) to read from the same stream of commands.</p>
<center>
<svg xmlns="http://www.w3.org/2000/svg" style="background-color: rgb(255, 255, 255);" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="461px" height="271px" viewBox="-0.5 -0.5 461 271" content="<mxfile host="app.diagrams.net" modified="2023-06-12T14:15:04.599Z" agent="5.0 (Macintosh)" etag="EH52wurUC0SzqnpzNJtG" version="15.8.6"><diagram id="rqCtcK7Q9IwAA8DcKYVc" name="Page-1">3VlNc9owEP01HJOxLIzhGEKazJR2yHBoc1SwsNUai8oy2Pn1lZGErYiPEBJMuDDeJ+1a2t23lpYWvJ3l9wzNox80wHHLdYK8BQct1+21O+K3BAoJdIAngZCRQEKgAsbkBSvQUWhGApwaEzmlMSdzE5zQJMETbmCIMbo0p01pbL51jkJsAeMJim30Fwl4JNGu51T4AyZhpN8MHDUyQ3qyAtIIBXRZg+BdC94ySrl8muW3OC59p/0i9b5tGV0vjOGEv0Xh5jEdL//l9z8nwwXqjtKHvte/aqu18UJvGAdi/0qkjEc0pAmK7yq0z2iWBLi06gipmjOkdC5AIMA/mPNCBRNlnAoo4rNYjeKc8N+l+rXvKfGpNjTIlemVUGgh4ayQWl1Py0/1wUpvJRmKI8zIDHPMFCi3Xe51qzcVlNKMTfAOF+qsRCzEfMc8uI654AqmYjWsEHoMx4iThbkOpLI2XM9TqjeMoaI2YU5JwtOa5VEJiAmKf+2eKy0q+rldI0nEg7SopdrSKmiVSAckVU++coHiTG1ryQjH6cZcG6JnUTGM/EAxCRPxPBGxKCPWX2DGieDkjRqYkSCQqYhT8oKeV/bKsCp3CONev+UN1oEuDeB8U8FQyhVN6ymwnTJ2HJX1K+faAS40nS6lwyJthVJztdBi17RAp9NUJKAZ3Q+Jp9dskXDfVSQghF+xSrhHVgkj9ocGumsRl2EUXAhvvZ28FbR1HGAy7Ejefj4xgRWvMWYLIhLxdcRMOi4jUY7Hc7TK2KU4t5lx3Op7K/G3uhP2HLME6jAtq0MU0FhUO0B1nE/Kbdfy1YimPGQbPktNOwtucFbvlL7yz6jgO+8s+KDhgg+/QsHXt6wLPar5e49q5pnq/At+s/e1Oi0rlm5mZgNUAlvK7Im45F80l8Dee4+4XHY+kk7asvvqWAZ908Qn0q3Rmw84hG61D6HRHdn3GWyCps1+8uxLzkXRdN81p+0ZbAIfwtHXDD0ZQTvnQ9DL4eexncrj+HnZ3UOZsTv46Ttmx/bYLkRumtFWT9Y8hFY4v6PpX2RFs+lLdxs23aEAdjtnmIksbr6b4/pn1qAAdjenT8LHDCvbTTrLc8/NWTYFB4ijAQ0b91WnfTJfCbH6r1UWt+oPa3j3Hw==</diagram></mxfile>"><defs/><g><path d="M 210 60 L 222 60 L 222 203.63" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 222 208.88 L 218.5 201.88 L 222 203.63 L 225.5 201.88 Z" fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" 
height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 81px; margin-left: 221px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">writes</div></div></div></foreignObject><text x="221" y="84" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" text-anchor="middle">writes</text></switch></g><path d="M 150 60 L 149.97 93.63" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 149.97 98.88 L 146.48 91.88 L 149.97 93.63 L 153.48 91.88 Z" fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 80px; margin-left: 148px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">reads</div></div></div></foreignObject><text x="148" y="84" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" 
text-anchor="middle">reads</text></switch></g><rect x="120" y="0" width="120" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 118px; height: 1px; padding-top: 30px; margin-left: 121px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Service</div></div></div></foreignObject><text x="180" y="34" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="12px" text-anchor="middle">Service</text></switch></g><rect x="120" y="100" width="90" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 88px; height: 1px; padding-top: 130px; margin-left: 121px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Postgres</div></div></div></foreignObject><text x="165" y="134" fill="rgba(0, 0, 0, 1)" 
font-family="Helvetica" font-size="12px" text-anchor="middle">Postgres</text></switch></g><path d="M 150 210 L 149.97 166.37" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 149.97 161.12 L 153.47 168.12 L 149.97 166.37 L 146.47 168.12 Z" fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 187px; margin-left: 148px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">writes</div></div></div></foreignObject><text x="148" y="190" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" text-anchor="middle">writes</text></switch></g><path d="M 120 225 L 45 225 L 45 166.37" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 45 161.12 L 48.5 168.12 L 45 166.37 L 41.5 168.12 Z" fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 
191px; margin-left: 46px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">writes</div></div></div></foreignObject><text x="46" y="194" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" text-anchor="middle">writes</text></switch></g><path d="M 240 225 L 295 225 L 295 166.37" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 295 161.12 L 298.5 168.12 L 295 166.37 L 291.5 168.12 Z" fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 191px; margin-left: 296px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">writes</div></div></div></foreignObject><text x="296" y="194" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" text-anchor="middle">writes</text></switch></g><path d="M 240 240 L 415 240 L 415 166.37" fill="none" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 415 161.12 L 418.5 168.12 L 415 166.37 L 411.5 168.12 Z" 
fill="rgba(0, 0, 0, 1)" stroke="rgba(0, 0, 0, 1)" stroke-miterlimit="10" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 1px; height: 1px; padding-top: 191px; margin-left: 416px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1); "><div style="display: inline-block; font-size: 11px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; background-color: rgb(255, 255, 255); white-space: nowrap;">writes</div></div></div></foreignObject><text x="416" y="194" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="11px" text-anchor="middle">writes</text></switch></g><rect x="120" y="210" width="120" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 118px; height: 1px; padding-top: 240px; margin-left: 121px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Kafka</div></div></div></foreignObject><text x="180" y="244" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" 
font-size="12px" text-anchor="middle">Kafka</text></switch></g><rect x="0" y="100" width="90" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 88px; height: 1px; padding-top: 130px; margin-left: 1px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">Lucene</div></div></div></foreignObject><text x="45" y="134" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="12px" text-anchor="middle">Lucene</text></switch></g><rect x="250" y="100" width="90" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 88px; height: 1px; padding-top: 130px; margin-left: 251px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">BigQuery</div></div></div></foreignObject><text x="295" y="134" fill="rgba(0, 0, 0, 1)" 
font-family="Helvetica" font-size="12px" text-anchor="middle">BigQuery</text></switch></g><rect x="370" y="100" width="90" height="60" fill="rgba(255, 255, 255, 1)" stroke="rgba(0, 0, 0, 1)" pointer-events="all"/><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 88px; height: 1px; padding-top: 130px; margin-left: 371px;"><div style="box-sizing: border-box; font-size: 0px; text-align: center;" data-drawio-colors="color: rgba(0, 0, 0, 1); "><div style="display: inline-block; font-size: 12px; font-family: Helvetica; color: rgb(0, 0, 0); line-height: 1.2; pointer-events: all; white-space: normal; overflow-wrap: normal;">DataDog</div></div></div></foreignObject><text x="415" y="134" fill="rgba(0, 0, 0, 1)" font-family="Helvetica" font-size="12px" text-anchor="middle">DataDog</text></switch></g></g><switch><g requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"/><a transform="translate(0,-5)" xlink:href="https://www.diagrams.net/doc/faq/svg-export-text-problems" target="_blank"><text text-anchor="middle" font-size="10px" x="50%" y="100%">Viewer does not support full SVG 1.1</text></a></switch></svg>
</center>
http://activitystrea.ms/schema/1.0/postWebhook Failure Scenarios2023-05-28T00:00:00+00:002023-05-28T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/webhook-failure-scenarios/<p>Webhooks allow two applications to communicate events. It is relatively simple to get started with webhooks using HTTP and JSON. However, there are a number of failure scenarios that developers should be aware of in order to make their webhook implementation robust.</p>
<span id="continue-reading"></span><h2 id="terminology"><a class="zola-anchor" href="#terminology" aria-label="Anchor link for: terminology">Terminology</a></h2>
<p>For this post we are using the following terms:</p>
<ul>
<li>client - A program that sends a webhook request.</li>
<li>origin - A service that processes and responds to webhook requests.</li>
</ul>
<h2 id="unhealthy-origin"><a class="zola-anchor" href="#unhealthy-origin" aria-label="Anchor link for: unhealthy-origin">Unhealthy Origin</a></h2>
<p>The client sends a valid request but receives a 500 HTTP response code (or any 5xx server error) from the origin. This is the most common failure mode. The unhealthy origin may be due to an explicit change, such as deploying a new version of the origin service. The error may also be caused by an unhealthy service upstream of the origin, such as a database used by the origin becoming unavailable.</p>
<pre class="mermaid">
sequenceDiagram
participant client
participant origin
client ->> origin: POST /webhook
origin -->> client: 500 Internal Server Error
</pre>
<h2 id="invalid-request"><a class="zola-anchor" href="#invalid-request" aria-label="Anchor link for: invalid-request">Invalid Request</a></h2>
<p>The client sends a request that does not conform to the origin's specification. This may be due to incorrect or missing data in the HTTP body or the HTTP headers. For example: the request may be missing a <code>Content-Type: application/json</code> header or may send a boolean as a string in the body, <code>{ "active": "true" }</code> instead of <code>{ "active": true }</code>. This error may not always be the fault of the client. The origin may have been updated in such a way that previous requests were successful, but new requests are not. For example: the origin did not require a <code>Content-Type</code> header but now does. Or the origin used to accept both <code>"true"</code> and <code>true</code> as valid boolean values but is now more strict.</p>
<pre class="mermaid">
sequenceDiagram
participant client
participant origin
client ->> origin: POST /webhook
origin -->> client: 400 Bad Request
</pre>
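<p>To make the two examples above concrete, here is a minimal sketch of strict validation on the origin. The function name, the shape of <code>headers</code> (a plain object with lower-cased names) and the <code>active</code> field are assumptions for illustration, not part of any particular framework.</p>

```javascript
// Hypothetical origin-side validation for a webhook request.
// `headers` is a plain object with lower-cased header names and
// `rawBody` is the request body as a string; both shapes are assumptions.
function validateWebhookRequest(headers, rawBody) {
  const contentType = headers['content-type'] ?? '';
  if (!contentType.toLowerCase().startsWith('application/json')) {
    return { status: 400, error: 'Content-Type must be application/json' };
  }

  let body;
  try {
    body = JSON.parse(rawBody);
  } catch {
    return { status: 400, error: 'body is not valid JSON' };
  }

  // Strict mode: the string "true" is rejected, only a real boolean passes.
  if (body === null || typeof body !== 'object' || typeof body.active !== 'boolean') {
    return { status: 400, error: '"active" must be a boolean' };
  }

  return { status: 200 };
}
```

<p>An origin that previously coerced <code>"true"</code> into <code>true</code> and then ships a check like this one will start rejecting requests from clients that never changed.</p>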
<h2 id="network-error"><a class="zola-anchor" href="#network-error" aria-label="Anchor link for: network-error">Network Error</a></h2>
<p>The client sends a request to the origin but the request never reaches the origin. Network errors can happen whether we are sending the request over the internet or using a VPN tunnel. Modern networks are complex and there are many reasons why the network can fail:</p>
<ul>
<li>dns - When sending a webhook request using a URI (domain name), the client may not be able to determine the IP address. Example: the client is unable to resolve an IP address for the request URI.</li>
<li>tcp - The network protocol underlying HTTP failed. Example: the origin has an unrecoverable error and issues a TCP reset.</li>
<li>http - The error occurred within the HTTP protocol. Example: the client sent the request using HTTP/2 but the origin only understands HTTP/1. </li>
</ul>
<p>The cause of the network failure may be related to the client, the origin or somewhere in between. For example: requests are normally sent over the internet which involves transiting through Comcast's network. If Comcast has an issue, the requests will be dropped until network operators re-route traffic to another provider, such as Level3.</p>
<pre class="mermaid">
sequenceDiagram
participant client
participant origin
client --x origin: POST /webhook
</pre>
<p>A network timeout error is a special case that we discuss below.</p>
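<p>On a Node.js client, the dns and tcp failures in the list above usually surface as system error codes. The sketch below groups a few well-known codes by layer. The grouping and the function name are illustrative assumptions and the lists are not exhaustive.</p>

```javascript
// Node.js error codes grouped by the layer they usually point at.
// The grouping is an illustrative assumption, not an exhaustive list.
const DNS_CODES = new Set(['ENOTFOUND', 'EAI_AGAIN']);
const TCP_CODES = new Set(['ECONNRESET', 'ECONNREFUSED', 'EPIPE']);

function classifyNetworkError(err) {
  // Node's fetch() wraps the underlying system error in `cause`.
  const code = err?.code ?? err?.cause?.code;
  if (DNS_CODES.has(code)) return 'dns';
  if (TCP_CODES.has(code)) return 'tcp';
  return 'unknown';
}
```

<p>Recording the layer alongside the failed request makes it easier to tell a client-side DNS problem from an origin that is resetting connections.</p>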
<h2 id="origin-timeout"><a class="zola-anchor" href="#origin-timeout" aria-label="Anchor link for: origin-timeout">Origin Timeout</a></h2>
<p>The client sends a request to the origin, but a timeout occurs. The timeout may come from the origin, often in the form of a 504 Gateway Timeout error. The client may also time out on its own after a certain amount of time.</p>
<p>According to the client, a timeout is an error. However, the origin may have partially or completely processed the request. This makes it difficult for the client to know if it should re-send the request or not.</p>
<p>To understand why the origin would continue processing the request when a timeout occurs, we need to understand how HTTP requests are normally processed. Let us use an example where the origin receives a webhook request for updating inventory. When the origin receives that request, it will update a SQL database with the new inventory count. The origin will wait until the SQL update finishes and then send a response. If that update is blocked for a long time, the client may give up and time out. The update finally succeeds and the origin attempts to send a response, but the client has already given up and the response cannot be sent. In this scenario, the client has recorded a timeout error but the origin has processed the request.</p>
<pre class="mermaid">
sequenceDiagram
participant client
box origin network
participant origin
participant database
end
client ->> origin: POST /webhook
origin ->>database: update
alt timeout
origin -->> client: 504 Gateway Timeout
database -->> origin: update finished
origin --x client: attempt to respond
end
</pre>
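<p>The inventory example can be reduced to a toy model with plain numbers instead of real timers. The function and its two parameters are hypothetical. The only point is that the client's outcome and the origin's outcome are decided independently.</p>

```javascript
// A toy model of the timeline above, using plain numbers instead of real timers.
// `clientTimeoutMs` and `dbUpdateMs` are hypothetical parameters.
function simulateInventoryWebhook({ clientTimeoutMs, dbUpdateMs }) {
  // The origin only responds after the SQL update finishes.
  const originRespondsAtMs = dbUpdateMs;
  const clientGaveUp = originRespondsAtMs > clientTimeoutMs;

  return {
    clientOutcome: clientGaveUp ? 'timeout' : 'success',
    // The update is not cancelled when the client disconnects,
    // so the origin processes the request either way.
    originProcessed: true,
  };
}
```

<p>With a 5 second client timeout and a 30 second update, the model reports <code>clientOutcome: 'timeout'</code> together with <code>originProcessed: true</code>, which is exactly the ambiguous state described above.</p>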
<h2 id="origin-dropped-request"><a class="zola-anchor" href="#origin-dropped-request" aria-label="Anchor link for: origin-dropped-request">Origin Dropped Request</a></h2>
<p>The origin responds with a 2xx HTTP response code but does not actually process the request. This is one of the more insidious failure modes because it is almost impossible for the client to detect. Worse still, this issue is normally discovered days or weeks after it first occurs.</p>
<p>Many origins will synchronously process webhook requests. However, some origins choose to asynchronously process requests. This means the origin receives a request, writes a message to a queue for later processing and responds with a 2xx HTTP response code. The intention is for some other service to process the message in the queue. Unfortunately, the message may never get processed. The queue may drop the message, a service may encounter an error while trying to process the message or a service may fail to properly process the message and mistakenly mark the message as processed.</p>
<p>The false positive of a successful response and the delay in detecting this issue make this scenario one of the more challenging failure modes to resolve. Examples:</p>
<ul>
<li>The client may no longer have access to the data to re-send the request.</li>
<li>Some requests cannot be safely re-sent. For example: an old inventory update request should not be sent because a newer request already updated the inventory to the correct value.</li>
</ul>
<pre class="mermaid">
sequenceDiagram
participant client
box origin network
participant origin
participant queue
participant worker
end
client ->> origin: POST /webhook
origin ->>queue: save
origin -->> client: 200 OK
alt failure
worker ->> queue: get next
queue -->> worker: message
worker ->> queue: error
end
</pre>
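<p>The sequence in the diagram can be sketched with in-memory stand-ins for the origin, the queue and the worker. Every name here is hypothetical, and the worker is deliberately buggy: it marks a message as processed even when handling it fails.</p>

```javascript
// In-memory stand-ins for the origin, queue and worker in the diagram.
// All names here are hypothetical.
function createOrigin(queue) {
  return {
    // The origin acknowledges as soon as the message is queued.
    handleWebhook(body) {
      queue.push({ body, processed: false });
      return { status: 200 };
    },
  };
}

// A buggy worker: it marks the message as processed even when the handler throws.
function runBuggyWorker(queue, handler) {
  let applied = 0;
  for (const message of queue) {
    try {
      handler(message.body);
      applied += 1;
    } catch {
      // The error is swallowed; nothing is retried or reported to the client.
    }
    message.processed = true;
  }
  return applied;
}
```

<p>The client sees a 200, the queue shows the message as processed, and the inventory update was never applied. Nothing in this flow gives the client a signal that something went wrong.</p>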
<h2 id="simple-not-easy"><a class="zola-anchor" href="#simple-not-easy" aria-label="Anchor link for: simple-not-easy">Simple, Not Easy</a></h2>
<p>The concept of webhooks is simple, but it is not easy to properly implement webhooks in a way that is robust and dependable. If the above scenarios are not properly handled, a webhook implementation is at risk of data loss and will require a support team to resolve incidents. Some failure scenarios are not possible to defend against without first documenting the interface, including pre-conditions and post-conditions, that both the origin and client should adhere to.</p>
<p>I am building a webhook proxy, called <a href="https://github.com/hjr3/soldr#soldr">soldr</a>, to make webhook implementations resilient to the failure scenarios discussed above.</p>
http://activitystrea.ms/schema/1.0/postSend UDP Messages in Node.js Without dns.lookup2022-10-17T00:00:00+00:002022-10-17T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/send-udp-messages-in-nodejs-without-dns-lookup/<p>At work, I recently inherited a node service that was sending metrics to DataDog using the <a href="https://github.com/brightcove/hot-shots">brightcove/hot-shots</a> StatsD client. While investigating some issues with <code>dns.lookup</code>, I noticed other people had run into this same issue but there was no one sharing what a solution might look like.</p>
<span id="continue-reading"></span>
<p><em>Note: This post was significantly edited on November 6, 2022.</em></p>
<p>In a hurry? You can <a href="https://hermanradtke.com/send-udp-messages-in-nodejs-without-dns-lookup/#preventing-dns-lookup-in-hot-shots-statsd-client">skip</a> to the solution.</p>
<h2 id="dns-lookup-is-always-called"><a class="zola-anchor" href="#dns-lookup-is-always-called" aria-label="Anchor link for: dns-lookup-is-always-called"><code>dns.lookup</code> Is Always Called</a></h2>
<p>Let us create a simple program to send a message via UDP. We <em>can</em> use a domain name with <code>node:dgram</code>, but it is a bad idea. I explain why <a href="https://hermanradtke.com/send-udp-messages-in-nodejs-without-dns-lookup/#addendum-avoid-domain-names">here</a>. Let us assume we have a single IP address instead.</p>
<pre data-lang="js" style="background-color:#2b303b;color:#c0c5ce;" class="language-js "><code class="language-js" data-lang="js"><span style="color:#b48ead;">import </span><span style="color:#bf616a;">dgram </span><span style="color:#b48ead;">from </span><span>'</span><span style="color:#a3be8c;">node:dgram</span><span>';
</span><span style="color:#b48ead;">import </span><span style="color:#bf616a;">dns </span><span style="color:#b48ead;">from </span><span>'</span><span style="color:#a3be8c;">node:dns</span><span>';
</span><span>
</span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">originalLookup </span><span>= </span><span style="color:#bf616a;">dns</span><span>.</span><span style="color:#bf616a;">lookup</span><span>;
</span><span style="color:#bf616a;">dns</span><span>.</span><span style="color:#8fa1b3;">lookup </span><span>= (...</span><span style="color:#bf616a;">args</span><span>) </span><span style="color:#b48ead;">=> </span><span>{
</span><span> </span><span style="color:#ebcb8b;">console</span><span>.</span><span style="color:#96b5b4;">log</span><span>('</span><span style="color:#a3be8c;">called dns.lookup</span><span>');
</span><span> </span><span style="color:#8fa1b3;">originalLookup</span><span>(...</span><span style="color:#bf616a;">args</span><span>);
</span><span>};
</span><span>
</span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">ip </span><span>= '</span><span style="color:#a3be8c;">93.184.216.34</span><span>'; </span><span style="color:#65737e;">// www.example.com
</span><span style="color:#b48ead;">const </span><span style="color:#bf616a;">socket </span><span>= </span><span style="color:#bf616a;">dgram</span><span>.</span><span style="color:#8fa1b3;">createSocket</span><span>('</span><span style="color:#a3be8c;">udp4</span><span>');
</span><span>
</span><span style="color:#bf616a;">socket</span><span>.</span><span style="color:#96b5b4;">send</span><span>('</span><span style="color:#a3be8c;">foo</span><span>', </span><span style="color:#d08770;">8125</span><span>, </span><span style="color:#bf616a;">ip</span><span>, (</span><span style="color:#bf616a;">err</span><span>) </span><span style="color:#b48ead;">=> </span><span>{
</span><span> </span><span style="color:#bf616a;">socket</span><span>.</span><span style="color:#96b5b4;">close</span><span>();
</span><span>});
</span></code></pre>
<p>Because we pass an IP address rather than a domain name, we expect this program to bypass all calls to <code>dns.lookup</code>.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ node udp.mjs
</span><span>called dns.lookup
</span><span>called dns.lookup
</span></code></pre>
<p>This is surprising and we are not the only ones who think so. This behavior was first called out in <a href="https://github.com/nodejs/node/issues/35130">nodejs/node#35130</a> but was dismissed with a <em>won't fix</em> response. It was brought up again in <a href="https://github.com/nodejs/node/issues/39468">nodejs/node#39468</a> because (as the <code>node:dgram</code> docs warn) we are still delayed by at least one tick of the event loop, as shown in <a href="https://github.com/nodejs/node/blob/b3723fac05aa86a4e0604e218dbd8ae24609172b/lib/dns.js#L155-L164">b3723fac05</a>.</p>
<h2 id="avoiding-dns-lookup-when-using-ip-address"><a class="zola-anchor" href="#avoiding-dns-lookup-when-using-ip-address" aria-label="Anchor link for: avoiding-dns-lookup-when-using-ip-address">Avoiding <code>dns.lookup</code> When Using IP Address</a></h2>
<p>To avoid <code>dns.lookup</code>, we configure our socket to use a custom lookup function.</p>
<pre data-lang="js" style="background-color:#2b303b;color:#c0c5ce;" class="language-js "><code class="language-js" data-lang="js"><span style="color:#b48ead;">const </span><span style="color:#bf616a;">socket </span><span>= </span><span style="color:#bf616a;">dgram</span><span>.</span><span style="color:#8fa1b3;">createSocket</span><span>({
</span><span> type: '</span><span style="color:#a3be8c;">udp4</span><span>',
</span><span> </span><span style="color:#8fa1b3;">lookup</span><span>: (</span><span style="color:#bf616a;">hostname</span><span>, </span><span style="color:#bf616a;">_options</span><span>, </span><span style="color:#bf616a;">callback</span><span>) </span><span style="color:#b48ead;">=> </span><span>{
</span><span> </span><span style="color:#8fa1b3;">callback</span><span>(</span><span style="color:#d08770;">null</span><span>, </span><span style="color:#bf616a;">hostname</span><span>, '</span><span style="color:#a3be8c;">IPv4</span><span>');
</span><span> },
</span><span>});
</span></code></pre>
<p>The <code>hostname</code> value will be the value of <code>ip</code>. Now, when we run it we will not see any calls made to <code>dns.lookup</code>.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ node udp.mjs
</span></code></pre>
<h3 id="the-value-of-hostname-is-not-always-what-we-expect"><a class="zola-anchor" href="#the-value-of-hostname-is-not-always-what-we-expect" aria-label="Anchor link for: the-value-of-hostname-is-not-always-what-we-expect">The Value of <code>hostname</code> Is Not Always What We Expect</a></h3>
<p>We might consider swapping out <code>hostname</code> for <code>ip</code> in the callback, but that will cause a problem.</p>
<pre data-lang="diff" style="background-color:#2b303b;color:#c0c5ce;" class="language-diff "><code class="language-diff" data-lang="diff"><span style="color:#bf616a;">- callback(null, hostname, 'IPv4');
</span><span style="color:#a3be8c;">+ callback(null, ip, 'IPv4');
</span></code></pre>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ node udp.mjs
</span><span>node:internal/errors:484
</span><span> ErrorCaptureStackTrace(err);
</span><span> ^
</span><span>
</span><span>Error: bind EADDRNOTAVAIL 93.184.216.34
</span><span> at node:dgram:359:20
</span><span> at lookup (file:///Users/herman/Code/udp-no-dns/udp.mjs:14:5)
</span><span> at UDP.lookup4 (node:internal/dgram:24:10)
</span><span> at Socket.bind (node:dgram:325:16)
</span><span> at Socket.send (node:dgram:645:10)
</span><span> at node:internal/util:364:7
</span><span> at new Promise (<anonymous>)
</span><span> at Socket.send2 (node:internal/util:350:12)
</span><span> at file:///Users/herman/Code/udp-no-dns/udp.mjs:21:14
</span><span>Emitted 'error' event on Socket instance at:
</span><span> at node:dgram:361:14
</span><span> at lookup (file:///Users/herman/Code/udp-no-dns/udp.mjs:14:5)
</span><span> [... lines matching original stack trace ...]
</span><span> at file:///Users/herman/Code/udp-no-dns/udp.mjs:21:14 {
</span><span> errno: -49,
</span><span> code: 'EADDRNOTAVAIL',
</span><span> syscall: 'bind',
</span><span> address: '93.184.216.34'
</span><span>}
</span></code></pre>
<p>The issue is that our <code>socket.send</code> first tries to bind to a local address (e.g. <code>0.0.0.0</code>), which calls our custom lookup function. This is why our first example printed <em>called dns.lookup</em> twice: first for the local address and the second time for the <code>host</code> parameter of <code>socket.send</code>. Our custom lookup function returned <code>93.184.216.34</code> both times. The socket cannot bind to a non-local address like <code>93.184.216.34</code> and emitted an error that told us as much. Now that we know that our lookup function can be called in unexpected ways, let us change the function to bypass <code>dns.lookup</code> only when <code>hostname</code> matches our expected IP address.</p>
<p>If we want to be really safe, we can consider calling <code>dns.lookup</code> for any value of <code>hostname</code> other than <code>ip</code>.</p>
<pre data-lang="js" style="background-color:#2b303b;color:#c0c5ce;" class="language-js "><code class="language-js" data-lang="js"><span style="color:#b48ead;">const </span><span style="color:#bf616a;">socket </span><span>= </span><span style="color:#bf616a;">dgram</span><span>.</span><span style="color:#8fa1b3;">createSocket</span><span>({
</span><span> type: '</span><span style="color:#a3be8c;">udp4</span><span>',
</span><span> </span><span style="color:#8fa1b3;">lookup</span><span>: (</span><span style="color:#bf616a;">hostname</span><span>, </span><span style="color:#bf616a;">options</span><span>, </span><span style="color:#bf616a;">callback</span><span>) </span><span style="color:#b48ead;">=> </span><span>{
</span><span> </span><span style="color:#b48ead;">if </span><span>(</span><span style="color:#bf616a;">hostname </span><span>=== </span><span style="color:#bf616a;">ip</span><span>) {
</span><span> </span><span style="color:#8fa1b3;">callback</span><span>(</span><span style="color:#d08770;">null</span><span>, </span><span style="color:#bf616a;">ip</span><span>, '</span><span style="color:#a3be8c;">IPv4</span><span>');
</span><span> </span><span style="color:#b48ead;">return</span><span>;
</span><span> }
</span><span>
</span><span> </span><span style="color:#bf616a;">dns</span><span>.</span><span style="color:#8fa1b3;">lookup</span><span>(</span><span style="color:#bf616a;">hostname</span><span>, </span><span style="color:#bf616a;">options</span><span>, </span><span style="color:#bf616a;">callback</span><span>);
</span><span> },
</span><span>});
</span></code></pre>
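<p>If several sockets need this behavior, the same logic can be pulled into a small factory. This is a sketch: <code>createStaticLookup</code> is a hypothetical helper, not part of <code>node:dgram</code>, and the fallback is passed in explicitly so the function can be exercised without touching the network. It reports the family as the number <code>4</code>, which is how <code>dns.lookup</code> itself reports it, whereas the examples above pass the string <code>'IPv4'</code>.</p>

```javascript
// Build a lookup function that short-circuits for one known IPv4 address
// and defers to `fallbackLookup` (for example dns.lookup) for everything else.
// `createStaticLookup` is a hypothetical helper, not part of node:dgram.
function createStaticLookup(ip, fallbackLookup) {
  return (hostname, options, callback) => {
    if (hostname === ip) {
      // dns.lookup reports the family as a number (4 or 6).
      callback(null, ip, 4);
      return;
    }
    fallbackLookup(hostname, options, callback);
  };
}

// Usage, assuming `dgram` and `dns` are imported as in the program above:
// const socket = dgram.createSocket({
//   type: 'udp4',
//   lookup: createStaticLookup('93.184.216.34', dns.lookup),
// });
```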
<h2 id="preventing-dns-lookup-in-hot-shots-statsd-client"><a class="zola-anchor" href="#preventing-dns-lookup-in-hot-shots-statsd-client" aria-label="Anchor link for: preventing-dns-lookup-in-hot-shots-statsd-client">Preventing DNS Lookup in hot-shots StatsD Client</a></h2>
<p>Now that we know about custom lookup functions, we can apply this same approach to the hot-shots StatsD client. A recent patch made it possible to pass UDP socket options when creating the client. Since <a href="https://github.com/brightcove/hot-shots/commit/a399dda99fb1bf2b15e53646b3ef5d8cbb0b90c9">a399dda</a> landed in <code>v9.2.0</code> you can do:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>const client = new StatsD({
</span><span> host,
</span><span> port,
</span><span> udpSocketOptions: {
</span><span> type: 'udp4',
</span><span> lookup: (hostname, options, callback) => {
</span><span> // our program above
</span><span> },
</span><span> },
</span><span>});
</span></code></pre>
<h2 id="addendum-avoid-domain-names"><a class="zola-anchor" href="#addendum-avoid-domain-names" aria-label="Anchor link for: addendum-avoid-domain-names">Addendum: Avoid Domain Names</a></h2>
<p>We prefer UDP for sending data like metrics because it is fast. We do not want the overhead of TCP and we are fine dropping some packets. When using a domain name, the docs warn us:</p>
<blockquote>
<p>DNS lookups delay the time to send for at least one tick of the Node.js event loop.</p>
</blockquote>
<p>Depending on how fast our DNS server is, we may be delayed for much longer than one tick of the event loop. However, things actually get worse. In <a href="https://nodejs.org/api/dns.html#dnslookup">Implementation considerations</a>, we are warned:</p>
<blockquote>
<p>Though the call to dns.lookup() will be asynchronous from JavaScript's perspective, it is implemented as a synchronous call to getaddrinfo(3) that runs on libuv's threadpool. This can have surprising negative performance implications for some applications, see the UV_THREADPOOL_SIZE documentation for more information.</p>
</blockquote>
<p>The <code>getaddrinfo</code> function is written in C. It is a blocking function, which would cause problems for our event loop. To prevent blocking, the call to <code>getaddrinfo</code> is made using an internal threadpool. From <a href="https://nodejs.org/api/cli.html#uv_threadpool_sizesize">UV_THREADPOOL_SIZE</a>:</p>
<blockquote>
<p>Because libuv's threadpool has a fixed size, it means that if for whatever reason any of these APIs takes a long time, other (seemingly unrelated) APIs that run in libuv's threadpool will experience degraded performance.</p>
</blockquote>
<p>If we are sending a lot of UDP messages, we absolutely do not want to be using domain names.</p>
<h3 id="dns-caching"><a class="zola-anchor" href="#dns-caching" aria-label="Anchor link for: dns-caching">DNS Caching</a></h3>
<p>We may be forced to use a domain name if the IP address (or addresses) change. In that case, our best bet is to use some sort of DNS cache. Choosing a proper implementation is for another post. However, once we decide on a cache implementation, we can combine the DNS cache with our custom lookup function to avoid calling <code>dns.lookup</code>.</p>
http://activitystrea.ms/schema/1.0/postLanding Page Router Using Fastly Compute@Edge and WASM2022-04-24T00:00:00+00:002022-04-24T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/landing-page-router-using-fastly-edge-compute-wasm/<p>A company often has a landing page for first time visitors that is optimized for describing and educating that person on what product or service the company is offering. This page is usually not useful for people already familiar with the company. Ideally, a new user would see the marketing landing page and the returning user would see a more functional page. There are two common approaches to solving this problem that both have pitfalls. I want to explore a third option using Fastly's Compute@Edge offering.</p>
<span id="continue-reading"></span>
<p>Here are the two common approaches to serving different content when a user visits the company website:</p>
<ol>
<li>A user browses to the index page of our website. That HTTP request is sent to our server and some server-side language (Node.js, PHP, Python, Ruby, etc.) checks if some cookie exists and serves the appropriate page. The problem is that we can no longer cache this page. Also, landing pages are usually static. It would be nice to serve a static landing page, or at least one that can be cached for a very long time.</li>
<li>Another option is to always serve the same landing page when a user browses to the index page of our website. We can cache this page at the edge for a very long time. Once the page loads, we can use JavaScript to check if some cookie exists and redirect the user to the more functional page. The problem is that a returning user will often see the marketing landing page flicker before they are redirected to the more functional page. Even if we put the JavaScript high up in the <code><head></code> element and try to prevent the flicker, we have a more subtle problem. We responded with a bunch of content that we immediately threw away. This is wasteful and, if we have a lot of people visiting our website, the bandwidth adds up.</li>
</ol>
<p>A third option I want to explore is to perform this logic in the CDN itself. A user browses to the index page of our website and that HTTP request first goes to our CDN. We have some compute that checks for some cookie and serves the appropriate page. Since we are still within the CDN, those pages are served from the CDN cache. We are going to use Fastly as our CDN. Now, we could do this check in Fastly's VCL itself. However, VCL is hard to develop and test. Let us explore what we can do with Rust and Compute@Edge.</p>
<p>This is going to require some set up. We will need a Fastly account and a website to serve as a backend. Fastly has a limited free plan, but it should be good enough. I will use my personal website as the backend. First time visitors (no cookie) will see the index page. Returning visitors will see the <a href="https://hermanradtke.com/tags/rustlang/">Tag: #rustlang</a> page.</p>
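<p>Before touching any Fastly tooling, the routing decision itself can be sketched in plain Rust. This is only a sketch under assumptions: the cookie name <code>returning</code> and the helper names <code>has_cookie</code> and <code>route_for</code> are hypothetical, and the function works on the raw <code>Cookie</code> header value rather than on any Fastly request type.</p>

```rust
/// Path served to first-time visitors (no cookie present).
const LANDING_PATH: &str = "/";
/// Path served to returning visitors.
const RETURNING_PATH: &str = "/tags/rustlang/";
/// Hypothetical cookie name; the post does not name one.
const RETURNING_COOKIE: &str = "returning";

/// Returns true when the raw `Cookie` header value contains a cookie
/// with the given name, e.g. "theme=dark; returning=1".
fn has_cookie(cookie_header: &str, name: &str) -> bool {
    cookie_header
        .split(';')
        .filter_map(|pair| pair.trim().split_once('='))
        .any(|(key, _value)| key == name)
}

/// Chooses the backend path for a request based on its `Cookie` header.
/// A missing header, or a header without the cookie, means a first visit.
fn route_for(cookie_header: Option<&str>) -> &'static str {
    match cookie_header {
        Some(header) if has_cookie(header, RETURNING_COOKIE) => RETURNING_PATH,
        _ => LANDING_PATH,
    }
}
```

<p>Keeping the decision in a pure function like this means it can be checked with <code>cargo test</code> on its own, independent of how the edge service eventually forwards the request.</p>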
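<ol>
<li>Start with the Compute@Edge learning guide: <a href="https://developer.fastly.com/learning/compute/">https://developer.fastly.com/learning/compute/</a></li>
<li>Go to <a href="https://manage.fastly.com/compute/">https://manage.fastly.com/compute/</a> and create a service. I made sure to name my service so it was easy to identify later.</li>
<li>Add the domain Fastly will use. I added lpr.hermanradtke.com.</li>
<li>Create the CNAME record for that domain. Let us skip TLS for right now.</li>
<li>Verify the record using <code>dig lpr.hermanradtke.com +short</code>.</li>
<li>Create a token at <a href="https://manage.fastly.com/account/personal/tokens">https://manage.fastly.com/account/personal/tokens</a>. Set global scope. I set it to never expire because this is a simple demo. Save and store it securely.</li>
<li>Install the CLI: <code>brew install fastly/tap/fastly</code>.</li>
<li>I opted to set up a fastly profile so I would not have to use <code>-t</code> or an env var.</li>
</ol>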
<p>Finally, we can start coding.</p>
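<p>Create a project directory and initialize it: <code>cd /path/to/Code</code>, <code>mkdir landing-page-router</code>, <code>cd !$</code>, then <code>fastly compute init</code>. Choose the options as follows:</p>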
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>fastly compute init
</span><span>
</span><span>Creating a new Compute@Edge project.
</span><span>
</span><span>Press ^C at any time to quit.
</span><span>
</span><span>Name: [landing-page-router]
</span><span>Description:
</span><span>Author: [herman@hermanradtke.com]
</span><span>Language:
</span><span>[1] Rust
</span><span>[2] JavaScript
</span><span>[3] AssemblyScript (beta)
</span><span>[4] Other ('bring your own' Wasm binary)
</span><span>Choose option: [1] 1
</span><span>Starter kit:
</span><span>[1] Default starter for Rust
</span><span> A basic starter kit that demonstrates routing, simple synthetic responses and
</span><span> overriding caching rules.
</span><span> https://github.com/fastly/compute-starter-kit-rust-default
</span><span>[2] Authenticate at edge with OAuth
</span><span> Connect to an identity provider such as Auth0 using OAuth 2.0 and validate
</span><span> authentication status at the Edge, to authorize access to your edge or origin hosted
</span><span> applications.
</span><span> https://github.com/fastly/compute-rust-auth
</span><span>[3] Beacon termination
</span><span> Capture beacon data from the browser, divert beacon request payloads to a log
</span><span> endpoint, and avoid putting load on your own infrastructure.
</span><span> https://github.com/fastly/compute-starter-kit-rust-beacon-termination
</span><span>[4] Empty starter for Rust
</span><span> An empty starter kit project template.
</span><span> https://github.com/fastly/compute-starter-kit-rust-empty
</span><span>[5] Static content
</span><span> Apply performance, security and usability upgrades to static bucket services such as
</span><span> Google Cloud Storage or AWS S3.
</span><span> https://github.com/fastly/compute-starter-kit-rust-static-content
</span><span>Choose option or paste git URL: [1] 4
</span><span>
</span><span>✓ Initializing...
</span><span>✓ Fetching package template...
</span><span>✓ Updating package manifest...
</span><span>✓ Initializing package...
</span><span>
</span><span>Initialized package landing-page-router to:
</span><span> /Users/herman/Code/landing-page-router
</span><span>
</span><span>To publish the package (build and deploy), run:
</span><span> fastly compute publish
</span><span>
</span><span>To learn about deploying Compute@Edge projects using third-party orchestration tools, visit:
</span><span> https://developer.fastly.com/learning/integrations/orchestration/
</span><span>
</span><span>
</span><span>SUCCESS: Initialized package landing-page-router
</span></code></pre>
<p>Now we are supposed to run <code>$ fastly compute build</code> to verify. Great, that works. But wait, I want to use familiar tools. Let us see if <code>cargo check</code> still works. It does.</p>
<p>Let us now deploy to make sure this simple example works.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>fastly compute deploy
</span><span>
</span><span>There is no Fastly service associated with this package. To connect to an existing service
</span><span>add the Service ID to the fastly.toml file, otherwise follow the prompts to create a
</span><span>service now.
</span><span>
</span><span>Press ^C at any time to quit.
</span><span>
</span><span>Create new service: [y/N] N
</span></code></pre>
<p>I stop because we already made a service. Let us find the service ID by running <code>fastly service list</code>. We can now add the ID to the <code>fastly.toml</code> file.</p>
<p>But now we need to tell Fastly to send requests to our backend:</p>
<p><code>fastly backend create --version=2 --name="Blog" --address="hermanradtke.com" --use-ssl</code>
<code>fastly backend describe --version=latest --name="Blog"</code>
<code>fastly service-version activate --version=latest</code>
<code>fastly domain validate --name=lpr.hermanradtke.com --version=active</code></p>
http://activitystrea.ms/schema/1.0/postHow To Mock Functions That Have External HTTP Requests2022-04-23T00:00:00+00:002022-04-23T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/how-to-mock-functions-that-have-external-http-requests/<p>When writing tests, we do not want to hit the external API each time we run our tests. If we are coming from a dynamic language, such as Node.js, we may want to reach for a solution like <a href="https://www.npmjs.com/package/fetch-mock">fetch-mock</a>, which will patch the implementation of <code>fetch</code> at runtime. This is not practical in Rust. There are some attempts, like the <a href="https://github.com/Shizcow/hotpatch">hotpatch</a> crate, but we will use a different strategy.</p>
<span id="continue-reading"></span>
<p>The complete code for this post can be found at: <a href="https://github.com/hjr3/the-cat-api-http-mocks">https://github.com/hjr3/the-cat-api-http-mocks</a></p>
<h2 id="calling-the-cat-api"><a class="zola-anchor" href="#calling-the-cat-api" aria-label="Anchor link for: calling-the-cat-api">Calling The Cat API</a></h2>
<p>Let us start with an example. We will write a program to search for cat breeds using <a href="https://docs.thecatapi.com/">The Cat API</a>. First, let us discover how this API works. Reading the docs for <a href="https://docs.thecatapi.com/api-reference/breeds/breeds-search">GET /breeds/search</a>, we can search for breeds using the <code>q</code> query parameter. Using curl, we can try this out (the URL is quoted so that shells such as zsh do not treat the <code>?</code> as a glob character):</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ curl 'https://api.thecatapi.com/v1/breeds/search?q=sib' | jq
</span></code></pre>
<pre data-lang="json" style="background-color:#2b303b;color:#c0c5ce;" class="language-json "><code class="language-json" data-lang="json"><span>[
</span><span> {
</span><span> "</span><span style="color:#a3be8c;">weight</span><span>": {
</span><span> "</span><span style="color:#a3be8c;">imperial</span><span>": "</span><span style="color:#a3be8c;">8 - 16</span><span>",
</span><span> "</span><span style="color:#a3be8c;">metric</span><span>": "</span><span style="color:#a3be8c;">4 - 7</span><span>"
</span><span> },
</span><span> "</span><span style="color:#a3be8c;">id</span><span>": "</span><span style="color:#a3be8c;">sibe</span><span>",
</span><span> "</span><span style="color:#a3be8c;">name</span><span>": "</span><span style="color:#a3be8c;">Siberian</span><span>",
</span><span> "</span><span style="color:#a3be8c;">cfa_url</span><span>": "</span><span style="color:#a3be8c;">http://cfa.org/Breeds/BreedsSthruT/Siberian.aspx</span><span>",
</span><span> "</span><span style="color:#a3be8c;">vetstreet_url</span><span>": "</span><span style="color:#a3be8c;">http://www.vetstreet.com/cats/siberian</span><span>",
</span><span> "</span><span style="color:#a3be8c;">vcahospitals_url</span><span>": "</span><span style="color:#a3be8c;">https://vcahospitals.com/know-your-pet/cat-breeds/siberian</span><span>",
</span><span> "</span><span style="color:#a3be8c;">temperament</span><span>": "</span><span style="color:#a3be8c;">Curious, Intelligent, Loyal, Sweet, Agile, Playful, Affectionate</span><span>",
</span><span> "</span><span style="color:#a3be8c;">origin</span><span>": "</span><span style="color:#a3be8c;">Russia</span><span>",
</span><span> "</span><span style="color:#a3be8c;">country_codes</span><span>": "</span><span style="color:#a3be8c;">RU</span><span>",
</span><span> "</span><span style="color:#a3be8c;">country_code</span><span>": "</span><span style="color:#a3be8c;">RU</span><span>",
</span><span> "</span><span style="color:#a3be8c;">description</span><span>": "</span><span style="color:#a3be8c;">The Siberians dog like temperament and affection makes the ideal lap cat and will live quite happily indoors. Very agile and powerful, the Siberian cat can easily leap and reach high places, including the tops of refrigerators and even doors. </span><span>",
</span><span> "</span><span style="color:#a3be8c;">life_span</span><span>": "</span><span style="color:#a3be8c;">12 - 15</span><span>",
</span><span> "</span><span style="color:#a3be8c;">indoor</span><span>": </span><span style="color:#d08770;">0</span><span>,
</span><span> "</span><span style="color:#a3be8c;">lap</span><span>": </span><span style="color:#d08770;">1</span><span>,
</span><span> "</span><span style="color:#a3be8c;">alt_names</span><span>": "</span><span style="color:#a3be8c;">Moscow Semi-longhair, HairSiberian Forest Cat</span><span>",
</span><span> "</span><span style="color:#a3be8c;">adaptability</span><span>": </span><span style="color:#d08770;">5</span><span>,
</span><span> "</span><span style="color:#a3be8c;">affection_level</span><span>": </span><span style="color:#d08770;">5</span><span>,
</span><span> "</span><span style="color:#a3be8c;">child_friendly</span><span>": </span><span style="color:#d08770;">4</span><span>,
</span><span> "</span><span style="color:#a3be8c;">dog_friendly</span><span>": </span><span style="color:#d08770;">5</span><span>,
</span><span> "</span><span style="color:#a3be8c;">energy_level</span><span>": </span><span style="color:#d08770;">5</span><span>,
</span><span> "</span><span style="color:#a3be8c;">grooming</span><span>": </span><span style="color:#d08770;">2</span><span>,
</span><span> "</span><span style="color:#a3be8c;">health_issues</span><span>": </span><span style="color:#d08770;">2</span><span>,
</span><span> "</span><span style="color:#a3be8c;">intelligence</span><span>": </span><span style="color:#d08770;">5</span><span>,
</span><span> "</span><span style="color:#a3be8c;">shedding_level</span><span>": </span><span style="color:#d08770;">3</span><span>,
</span><span> "</span><span style="color:#a3be8c;">social_needs</span><span>": </span><span style="color:#d08770;">4</span><span>,
</span><span> "</span><span style="color:#a3be8c;">stranger_friendly</span><span>": </span><span style="color:#d08770;">3</span><span>,
</span><span> "</span><span style="color:#a3be8c;">vocalisation</span><span>": </span><span style="color:#d08770;">1</span><span>,
</span><span> "</span><span style="color:#a3be8c;">experimental</span><span>": </span><span style="color:#d08770;">0</span><span>,
</span><span> "</span><span style="color:#a3be8c;">hairless</span><span>": </span><span style="color:#d08770;">0</span><span>,
</span><span> "</span><span style="color:#a3be8c;">natural</span><span>": </span><span style="color:#d08770;">1</span><span>,
</span><span> "</span><span style="color:#a3be8c;">rare</span><span>": </span><span style="color:#d08770;">0</span><span>,
</span><span> "</span><span style="color:#a3be8c;">rex</span><span>": </span><span style="color:#d08770;">0</span><span>,
</span><span> "</span><span style="color:#a3be8c;">suppressed_tail</span><span>": </span><span style="color:#d08770;">0</span><span>,
</span><span> "</span><span style="color:#a3be8c;">short_legs</span><span>": </span><span style="color:#d08770;">0</span><span>,
</span><span> "</span><span style="color:#a3be8c;">wikipedia_url</span><span>": "</span><span style="color:#a3be8c;">https://en.wikipedia.org/wiki/Siberian_(cat)</span><span>",
</span><span> "</span><span style="color:#a3be8c;">hypoallergenic</span><span>": </span><span style="color:#d08770;">1</span><span>,
</span><span> "</span><span style="color:#a3be8c;">reference_image_id</span><span>": "</span><span style="color:#a3be8c;">3bkZAjRh1</span><span>"
</span><span> }
</span><span>]
</span></code></pre>
<p>Now that we know how to make a request and what the shape of the response looks like, we can write a program. We will use the reqwest crate to make our HTTP requests. I will opt for blocking behavior to avoid any async type juggling.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">use </span><span>serde::Deserialize;
</span><span>
</span><span>#[</span><span style="color:#bf616a;">derive</span><span>(Debug, Deserialize)]
</span><span style="color:#b48ead;">struct </span><span>Breed {
</span><span> </span><span style="color:#bf616a;">id</span><span>: String,
</span><span> </span><span style="color:#bf616a;">name</span><span>: String,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">type </span><span>BreedResponse = Vec<Breed>;
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">search_breeds</span><span>(</span><span style="color:#bf616a;">query</span><span>: &</span><span style="color:#b48ead;">str</span><span>) -> Result<BreedResponse, Box<dyn std::error::Error>> {
</span><span> </span><span style="color:#b48ead;">let</span><span> url = format!("</span><span style="color:#a3be8c;">https://api.thecatapi.com/v1/breeds/search?q=</span><span style="color:#d08770;">{}</span><span>", query);
</span><span> </span><span style="color:#b48ead;">let</span><span> resp = reqwest::blocking::get(url)?.json::<BreedResponse>()?;
</span><span>
</span><span> Ok(resp)
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() -> Result<(), Box<dyn std::error::Error>> {
</span><span> </span><span style="color:#b48ead;">let</span><span> resp = </span><span style="color:#96b5b4;">search_breeds</span><span>("</span><span style="color:#a3be8c;">sib</span><span>")?;
</span><span> println!("</span><span style="color:#d08770;">{:#?}</span><span>", resp);
</span><span> Ok(())
</span><span>}
</span></code></pre>
<p>Note: By default, serde ignores JSON fields that are not declared on the struct, so we can define a subset of the response. I only specified a few fields in <code>Breed</code> for brevity.</p>
<p>If we run our program, we should see something like:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cargo run
</span><span>[
</span><span> Breed {
</span><span> id: "sibe",
</span><span> name: "Siberian",
</span><span> },
</span><span>]
</span></code></pre>
<h2 id="create-mocks-using-traits"><a class="zola-anchor" href="#create-mocks-using-traits" aria-label="Anchor link for: create-mocks-using-traits">Create Mocks Using Traits</a></h2>
<p>Now, we want to test our program without making an actual HTTP request. We can use a <a href="https://doc.rust-lang.org/book/ch10-02-traits.html">trait</a> to define the types of requests we can make. We will define a trait called <code>TheCatApi</code> and then implement a concrete <code>TheCatApiClient</code> type.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">trait </span><span>TheCatApi {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">search_breeds</span><span>(&</span><span style="color:#bf616a;">self</span><span>, </span><span style="color:#bf616a;">query</span><span>: &</span><span style="color:#b48ead;">str</span><span>) -> Result<BreedResponse, Box<dyn std::error::Error>>;
</span><span>}
</span><span>
</span><span style="color:#b48ead;">struct </span><span>TheCatApiClient {}
</span><span style="color:#b48ead;">impl </span><span>TheCatApi </span><span style="color:#b48ead;">for </span><span>TheCatApiClient {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">search_breeds</span><span>(&</span><span style="color:#bf616a;">self</span><span>, </span><span style="color:#bf616a;">query</span><span>: &</span><span style="color:#b48ead;">str</span><span>) -> Result<BreedResponse, Box<dyn std::error::Error>> {
</span><span> </span><span style="color:#b48ead;">let</span><span> url = format!("</span><span style="color:#a3be8c;">https://api.thecatapi.com/v1/breeds/search?q=</span><span style="color:#d08770;">{}</span><span>", query);
</span><span> </span><span style="color:#b48ead;">let</span><span> resp = reqwest::blocking::get(url)?.json::<BreedResponse>()?;
</span><span>
</span><span> Ok(resp)
</span><span> }
</span><span>}
</span></code></pre>
<p>Now our main function looks like this:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() -> Result<(), Box<dyn std::error::Error>> {
</span><span> </span><span style="color:#b48ead;">let</span><span> client = TheCatApiClient {};
</span><span> </span><span style="color:#b48ead;">let</span><span> resp = client.</span><span style="color:#96b5b4;">search_breeds</span><span>("</span><span style="color:#a3be8c;">sib</span><span>")?;
</span><span> println!("</span><span style="color:#d08770;">{:#?}</span><span>", resp);
</span><span>
</span><span> Ok(())
</span><span>}
</span></code></pre>
<p>If we run our program, we should get the same output as above.</p>
<p>Now that we have a <code>TheCatApi</code> trait, we can also implement a mock client. We use the output from our curl request above and implement the <code>search_breeds</code> function to deserialize the JSON.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>TheCatApiClientMock {}
</span><span>
</span><span style="color:#b48ead;">impl </span><span>TheCatApi </span><span style="color:#b48ead;">for </span><span>TheCatApiClientMock {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">search_breeds</span><span>(&</span><span style="color:#bf616a;">self</span><span>, </span><span style="color:#bf616a;">_query</span><span>: &</span><span style="color:#b48ead;">str</span><span>) -> Result<BreedResponse, Box<dyn std::error::Error>> {
</span><span> </span><span style="color:#b48ead;">let</span><span> data = </span><span style="color:#b48ead;">r</span><span>#"
</span><span style="color:#a3be8c;"> [
</span><span style="color:#a3be8c;"> {
</span><span style="color:#a3be8c;"> "weight": {
</span><span style="color:#a3be8c;"> "imperial": "8 - 16",
</span><span style="color:#a3be8c;"> "metric": "4 - 7"
</span><span style="color:#a3be8c;"> },
</span><span style="color:#a3be8c;"> "id": "sibe",
</span><span style="color:#a3be8c;"> "name": "Siberian",
</span><span style="color:#a3be8c;"> "cfa_url": "http://cfa.org/Breeds/BreedsSthruT/Siberian.aspx",
</span><span style="color:#a3be8c;"> "vetstreet_url": "http://www.vetstreet.com/cats/siberian",
</span><span style="color:#a3be8c;"> "vcahospitals_url": "https://vcahospitals.com/know-your-pet/cat-breeds/siberian",
</span><span style="color:#a3be8c;"> "temperament": "Curious, Intelligent, Loyal, Sweet, Agile, Playful, Affectionate",
</span><span style="color:#a3be8c;"> "origin": "Russia",
</span><span style="color:#a3be8c;"> "country_codes": "RU",
</span><span style="color:#a3be8c;"> "country_code": "RU",
</span><span style="color:#a3be8c;"> "description": "The Siberians dog like temperament and affection makes the ideal lap cat and will live quite happily indoors. Very agile and powerful, the Siberian cat can easily leap and reach high places, including the tops of refrigerators and even doors. ",
</span><span style="color:#a3be8c;"> "life_span": "12 - 15",
</span><span style="color:#a3be8c;"> "indoor": 0,
</span><span style="color:#a3be8c;"> "lap": 1,
</span><span style="color:#a3be8c;"> "alt_names": "Moscow Semi-longhair, HairSiberian Forest Cat",
</span><span style="color:#a3be8c;"> "adaptability": 5,
</span><span style="color:#a3be8c;"> "affection_level": 5,
</span><span style="color:#a3be8c;"> "child_friendly": 4,
</span><span style="color:#a3be8c;"> "dog_friendly": 5,
</span><span style="color:#a3be8c;"> "energy_level": 5,
</span><span style="color:#a3be8c;"> "grooming": 2,
</span><span style="color:#a3be8c;"> "health_issues": 2,
</span><span style="color:#a3be8c;"> "intelligence": 5,
</span><span style="color:#a3be8c;"> "shedding_level": 3,
</span><span style="color:#a3be8c;"> "social_needs": 4,
</span><span style="color:#a3be8c;"> "stranger_friendly": 3,
</span><span style="color:#a3be8c;"> "vocalisation": 1,
</span><span style="color:#a3be8c;"> "experimental": 0,
</span><span style="color:#a3be8c;"> "hairless": 0,
</span><span style="color:#a3be8c;"> "natural": 1,
</span><span style="color:#a3be8c;"> "rare": 0,
</span><span style="color:#a3be8c;"> "rex": 0,
</span><span style="color:#a3be8c;"> "suppressed_tail": 0,
</span><span style="color:#a3be8c;"> "short_legs": 0,
</span><span style="color:#a3be8c;"> "wikipedia_url": "https://en.wikipedia.org/wiki/Siberian_(cat)",
</span><span style="color:#a3be8c;"> "hypoallergenic": 1,
</span><span style="color:#a3be8c;"> "reference_image_id": "3bkZAjRh1"
</span><span style="color:#a3be8c;"> }
</span><span style="color:#a3be8c;"> ]
</span><span style="color:#a3be8c;"> </span><span>"#;
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> resp: BreedResponse = serde_json::from_str(data)?;
</span><span>
</span><span> Ok(resp)
</span><span> }
</span><span>}
</span></code></pre>
<p>You can also shorten this by constructing the value directly, for example <code>vec![Breed { id: "sibe".to_string(), name: "Siberian".to_string() }]</code>, but for real-world examples I find it easier to paste the JSON string.</p>
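<p>As a rough sketch of that shorter form, here is a mock that builds the <code>Breed</code> values by hand instead of deserializing JSON. To keep it self-contained, this version redeclares the trait and types without the serde derive, which is an assumption made for illustration only.</p>

```rust
use std::error::Error;

// Same shape as the `Breed` struct above, minus the serde derive,
// so this sketch compiles with the standard library alone.
#[derive(Debug, PartialEq)]
struct Breed {
    id: String,
    name: String,
}

type BreedResponse = Vec<Breed>;

trait TheCatApi {
    fn search_breeds(&self, query: &str) -> Result<BreedResponse, Box<dyn Error>>;
}

struct TheCatApiClientMock {}

impl TheCatApi for TheCatApiClientMock {
    // The query is ignored; the mock always returns the same single breed.
    fn search_breeds(&self, _query: &str) -> Result<BreedResponse, Box<dyn Error>> {
        Ok(vec![Breed {
            id: "sibe".to_string(),
            name: "Siberian".to_string(),
        }])
    }
}
```

<p>The trade-off is that a hand-built value never goes through deserialization, so it cannot catch a mismatch between the struct definition and the real JSON payload.</p>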
<p>Now we can wire up some tests. In this simple case, we are only testing that we implemented the <code>BreedResponse</code> type and <code>Breed</code> struct correctly.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">test</span><span>]
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">search_breeds</span><span>() {
</span><span> </span><span style="color:#b48ead;">struct </span><span>TheCatApiClientMock {}
</span><span> </span><span style="color:#b48ead;">impl </span><span>TheCatApi </span><span style="color:#b48ead;">for </span><span>TheCatApiClientMock {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">search_breeds</span><span>(
</span><span> &</span><span style="color:#bf616a;">self</span><span>,
</span><span> </span><span style="color:#bf616a;">_query</span><span>: &</span><span style="color:#b48ead;">str</span><span>,
</span><span> ) -> Result<BreedResponse, Box<dyn std::error::Error>> {
</span><span> </span><span style="color:#65737e;">// removed for brevity. use implementation above
</span><span> }
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> client = TheCatApiClientMock {};
</span><span> </span><span style="color:#b48ead;">let</span><span> resp = client
</span><span> .</span><span style="color:#96b5b4;">search_breeds</span><span>("</span><span style="color:#a3be8c;">does not matter what i put</span><span>")
</span><span> .</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">search_breeds failed</span><span>");
</span><span> assert_eq!(resp[</span><span style="color:#d08770;">0</span><span>].id, "</span><span style="color:#a3be8c;">sibe</span><span>");
</span><span> assert_eq!(resp[</span><span style="color:#d08770;">0</span><span>].name, "</span><span style="color:#a3be8c;">Siberian</span><span>");
</span><span>}
</span></code></pre>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cargo test
</span><span> Finished test [unoptimized + debuginfo] target(s) in 0.15s
</span><span> Running unittests src/main.rs (target/debug/deps/rust_mocks-aa82d6388d1da1bd)
</span><span>
</span><span>running 1 test
</span><span>test tests::search_breeds ... ok
</span><span>
</span><span>test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
</span></code></pre>
<p>And we should also test a decoding failure to make sure we are testing what we expect.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">test</span><span>]
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">search_breeds_decode_error</span><span>() {
</span><span> </span><span style="color:#b48ead;">struct </span><span>TheCatApiClientMock {}
</span><span> </span><span style="color:#b48ead;">impl </span><span>TheCatApi </span><span style="color:#b48ead;">for </span><span>TheCatApiClientMock {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">search_breeds</span><span>(
</span><span> &</span><span style="color:#bf616a;">self</span><span>,
</span><span> </span><span style="color:#bf616a;">_query</span><span>: &</span><span style="color:#b48ead;">str</span><span>,
</span><span> ) -> Result<BreedResponse, Box<dyn std::error::Error>> {
</span><span> </span><span style="color:#b48ead;">let</span><span> data = "</span><span style="color:#a3be8c;">nope</span><span>";
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> resp: BreedResponse = serde_json::from_str(data)?;
</span><span>
</span><span> Ok(resp)
</span><span> }
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> client = TheCatApiClientMock {};
</span><span> </span><span style="color:#b48ead;">let</span><span> err = client
</span><span> .</span><span style="color:#96b5b4;">search_breeds</span><span>("</span><span style="color:#a3be8c;">does not matter what i put</span><span>")
</span><span> .</span><span style="color:#96b5b4;">unwrap_err</span><span>();
</span><span> assert_eq!(err.</span><span style="color:#96b5b4;">to_string</span><span>(), "</span><span style="color:#a3be8c;">expected ident at line 1 column 2</span><span>");
</span><span>}
</span></code></pre>
<p>Notice that I put the mock client implementation inside the test function. If we define it outside the test function, then each mock needs a unique name.</p>
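<p>If per-test mock structs become repetitive, one possible alternative is a single mock whose canned outcome is supplied by each test. The following is only a sketch using the standard library: <code>Breed</code> is a simplified stand-in for the real type (which uses serde), and <code>CannedClient</code> is a hypothetical name that does not appear elsewhere in this post.</p>

```rust
use std::error::Error;

// Simplified stand-ins for the post's types (assumption: the real
// `Breed` derives serde traits and has more fields).
#[derive(Debug, Clone, PartialEq)]
pub struct Breed {
    pub id: String,
    pub name: String,
}

pub type BreedResponse = Vec<Breed>;

pub trait TheCatApi {
    fn search_breeds(&self, query: &str) -> Result<BreedResponse, Box<dyn Error>>;
}

// Hypothetical reusable mock: each test supplies the canned outcome,
// so one type can be shared instead of one mock struct per test.
pub struct CannedClient {
    pub outcome: Result<BreedResponse, String>,
}

impl TheCatApi for CannedClient {
    fn search_breeds(&self, _query: &str) -> Result<BreedResponse, Box<dyn Error>> {
        match &self.outcome {
            Ok(breeds) => Ok(breeds.clone()),
            // `Box<dyn Error>` implements `From<String>`.
            Err(message) => Err(message.clone().into()),
        }
    }
}
```

<p>A test would then build <code>CannedClient { outcome: Ok(...) }</code> or <code>CannedClient { outcome: Err(...) }</code> as needed. The trade-off is that the mock no longer exercises the decoding step the way the inline mocks above do.</p>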
<h2 id="injecting-traits-as-dependencies"><a class="zola-anchor" href="#injecting-traits-as-dependencies" aria-label="Anchor link for: injecting-traits-as-dependencies">Injecting Traits As Dependencies</a></h2>
<p>This is all good, but we have sidestepped a big part of implementing this strategy in a real program. A big reason people reach for testing libraries like <a href="https://www.npmjs.com/package/fetch-mock">fetch-mock</a> is that they have no other way to tell their function to use a different implementation of fetch. Indeed, we need to structure our program to inject dependencies. I mostly write web servers, so let us create a simple web server for a more real-world example. Our web server will accept a request to search for breeds and then use our implementation of the <code>TheCatApi</code> trait to get the data. I am going to use <a href="https://rocket.rs">rocket.rs</a> as it has really good docs. I will also be using the v0.4 version, which is blocking. Be aware that the v0.4 version requires the nightly compiler.</p>
<p>When we create our web server, we want to inject our dependencies. Specifically, we want to inject <code>TheCatApiClient</code>. We can do this in rocket.rs by using the <a href="https://api.rocket.rs/v0.4/rocket/struct.Rocket.html#method.manage">manage</a> method. This will allow us to access the client from the request handler.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> the_cat_api_client = TheCatApiClient {};
</span><span>
</span><span> rocket::ignite()
</span><span> </span><span style="color:#65737e;">// we inject our dependency here
</span><span> .</span><span style="color:#96b5b4;">manage</span><span>(the_cat_api_client)
</span><span> .</span><span style="color:#96b5b4;">mount</span><span>("</span><span style="color:#a3be8c;">/</span><span>", routes![index, get_breed])
</span><span> .</span><span style="color:#96b5b4;">launch</span><span>();
</span><span>}
</span><span>
</span><span>#[</span><span style="color:#bf616a;">get</span><span>("</span><span style="color:#a3be8c;">/breed?<search></span><span>")]
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">get_breed</span><span>(
</span><span> </span><span style="color:#65737e;">// we access our dependency here
</span><span> </span><span style="color:#bf616a;">client</span><span>: State<TheCatApiClient>,
</span><span> </span><span style="color:#bf616a;">search</span><span>: &RawStr,
</span><span>) -> Result<String, Box<dyn std::error::Error>> {
</span><span> </span><span style="color:#b48ead;">let</span><span> resp = client.</span><span style="color:#96b5b4;">inner</span><span>().</span><span style="color:#96b5b4;">search_breeds</span><span>(search)?;
</span><span>
</span><span> Ok(resp[</span><span style="color:#d08770;">0</span><span>].name.</span><span style="color:#96b5b4;">clone</span><span>())
</span><span>}
</span></code></pre>
<p>Now we can run our server:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>cargo run
</span><span> Compiling rust-mocks v0.1.0 (/Users/herman/Code/rust-mocks)
</span><span> Finished dev [unoptimized + debuginfo] target(s) in 2.10s
</span><span> Running `target/debug/rust-mocks`
</span><span>🔧 Configured for development.
</span><span> => address: localhost
</span><span> => port: 8000
</span><span> => log: normal
</span><span> => workers: 16
</span><span> => secret key: generated
</span><span> => limits: forms = 32KiB
</span><span> => keep-alive: 5s
</span><span> => read timeout: 5s
</span><span> => write timeout: 5s
</span><span> => tls: disabled
</span><span>🛰 Mounting /:
</span><span> => GET /breed?<search> (get_breed)
</span><span>🚀 Rocket has launched from http://localhost:8000
</span></code></pre>
<p>and make a request:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>$ curl 'localhost:</span><span style="color:#d08770;">8000</span><span>/breed?search=cal'
</span><span>California Spangled
</span></code></pre>
<p>Now that we have a working web server that makes requests to The Cat API, we want to write a test for our <code>get_breed</code> handler. The rocket.rs framework makes this fairly straightforward as <code>client</code> is a parameter to <code>get_breed</code>. We will need to change the type of client from the concrete implementation of <code>TheCatApiClient</code> to a type that will allow us to use any implementation of the <code>TheCatApi</code> trait. There are two ways to do this: generics and boxed traits. Unfortunately, rocket.rs does not allow us to use generic functions as handlers. If we try to write a generic handler, then we will get a compiler error.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">get</span><span>("</span><span style="color:#a3be8c;">/breed?<search></span><span>")]
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">get_breed</span><span><T: TheCatApi>(
</span><span> </span><span style="color:#bf616a;">client</span><span>: State<T>, </span><span style="color:#65737e;">// <---- compiler error here
</span><span> </span><span style="color:#bf616a;">search</span><span>: &RawStr,
</span><span>) -> Result<String, Box<dyn std::error::Error>> {}
</span></code></pre>
<p>So, boxed traits it is! We only create one instance of our API client when our program runs, so creating our client on the stack or the heap does not make much of a difference.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">get</span><span>("</span><span style="color:#a3be8c;">/breed?<search></span><span>")]
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">get_breed</span><span>(
</span><span> </span><span style="color:#bf616a;">client</span><span>: State<Box<dyn TheCatApi>>,
</span><span> </span><span style="color:#bf616a;">search</span><span>: &RawStr,
</span><span>) -> Result<String, Box<dyn std::error::Error>> {
</span><span> </span><span style="color:#b48ead;">let</span><span> resp = client.</span><span style="color:#96b5b4;">inner</span><span>().</span><span style="color:#96b5b4;">search_breeds</span><span>(search)?;
</span><span>
</span><span> Ok(resp[</span><span style="color:#d08770;">0</span><span>].name.</span><span style="color:#96b5b4;">clone</span><span>())
</span><span>}
</span></code></pre>
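<p>Outside of a Rocket handler, both options are available. As a rough, framework-free sketch of the difference, the two functions below accept any implementation of a simplified <code>TheCatApi</code> trait, one through generics and one through a trait object. The trait here returns plain breed names, and <code>FixedClient</code>, <code>first_breed_generic</code> and <code>first_breed_dyn</code> are hypothetical names used only for illustration.</p>

```rust
use std::error::Error;

// Simplified version of the post's trait (assumption: it returns breed
// names directly instead of the serde-backed `BreedResponse`).
pub trait TheCatApi {
    fn search_breeds(&self, query: &str) -> Result<Vec<String>, Box<dyn Error>>;
}

// Hypothetical mock that always returns the same names.
pub struct FixedClient(pub Vec<String>);

impl TheCatApi for FixedClient {
    fn search_breeds(&self, _query: &str) -> Result<Vec<String>, Box<dyn Error>> {
        Ok(self.0.clone())
    }
}

// Generics (static dispatch): the compiler generates one copy of this
// function per concrete client type it is called with.
pub fn first_breed_generic<T: TheCatApi>(
    client: &T,
    search: &str,
) -> Result<String, Box<dyn Error>> {
    let names = client.search_breeds(search)?;
    names.into_iter().next().ok_or_else(|| "no breeds found".into())
}

// Trait object (dynamic dispatch): a single function, with the concrete
// client chosen at runtime. A `Box<dyn TheCatApi>` can be passed with
// `boxed.as_ref()`.
pub fn first_breed_dyn(
    client: &dyn TheCatApi,
    search: &str,
) -> Result<String, Box<dyn Error>> {
    let names = client.search_breeds(search)?;
    names.into_iter().next().ok_or_else(|| "no breeds found".into())
}
```

<p>Both functions return an error for an empty result instead of indexing with <code>[0]</code>, which would panic.</p>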
<p>We need to update our main function as well.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> the_cat_api_client = Box::new(TheCatApiClient {});
</span><span>
</span><span> rocket::ignite()
</span><span> .</span><span style="color:#96b5b4;">manage</span><span>(the_cat_api_client)
</span><span> .</span><span style="color:#96b5b4;">mount</span><span>("</span><span style="color:#a3be8c;">/</span><span>", routes![index, get_breed])
</span><span> .</span><span style="color:#96b5b4;">launch</span><span>();
</span><span>}
</span></code></pre>
<p>Now our <code>get_breed</code> function can accept any implementation of the <code>TheCatApi</code> trait. We have one more thing to do before we can write our test. Notice that the <code>client</code> type in <code>get_breed</code> is <code>State<Box<dyn TheCatApi>></code>. We need some way of creating that <code>State</code> type. The rocket.rs docs have a <a href="https://api.rocket.rs/v0.4/rocket/request/struct.State.html#testing-with-state">Testing with <code>State</code></a> section that gives us a hint. To make this work, we need to extract the setup logic for our web server out of the <code>main</code> function. We create a <code>setup</code> function that allows us to pass in an implementation of the <code>TheCatApi</code> trait.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">setup</span><span>(</span><span style="color:#bf616a;">the_cat_api</span><span>: Box<dyn TheCatApi>) -> Rocket {
</span><span> rocket::ignite()
</span><span> .</span><span style="color:#96b5b4;">manage</span><span>(the_cat_api)
</span><span> .</span><span style="color:#96b5b4;">mount</span><span>("</span><span style="color:#a3be8c;">/</span><span>", routes![index, get_breed])
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> the_cat_api_client = Box::new(TheCatApiClient {});
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> rocket = </span><span style="color:#96b5b4;">setup</span><span>(the_cat_api_client);
</span><span> rocket.</span><span style="color:#96b5b4;">launch</span><span>();
</span><span>}
</span></code></pre>
<p>With that in place, we can write a test that allows us to ensure <code>get_breed</code> succeeds.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">test</span><span>]
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">breed_succeeds</span><span>() {
</span><span> </span><span style="color:#b48ead;">struct </span><span>TheCatApiClientMock {}
</span><span> </span><span style="color:#b48ead;">impl </span><span>TheCatApi </span><span style="color:#b48ead;">for </span><span>TheCatApiClientMock {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">search_breeds</span><span>(
</span><span> &</span><span style="color:#bf616a;">self</span><span>,
</span><span> </span><span style="color:#bf616a;">query</span><span>: &</span><span style="color:#b48ead;">str</span><span>,
</span><span> ) -> Result<BreedResponse, Box<dyn std::error::Error>> {
</span><span>
</span><span> </span><span style="color:#65737e;">// removed for brevity. use implementation above
</span><span> }
</span><span> }
</span><span>
</span><span> </span><span style="color:#65737e;">// create our mock client
</span><span> </span><span style="color:#b48ead;">let</span><span> mock_client = Box::new(TheCatApiClientMock {});
</span><span>
</span><span> </span><span style="color:#65737e;">// inject it into our web server
</span><span> </span><span style="color:#b48ead;">let</span><span> rocket = </span><span style="color:#96b5b4;">setup</span><span>(mock_client);
</span><span>
</span><span> </span><span style="color:#65737e;">// get our state
</span><span> </span><span style="color:#b48ead;">let</span><span> state: State<Box<dyn TheCatApi>> =
</span><span> State::from(&rocket).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">managing `TheCatApiClientMock`</span><span>");
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> resp = </span><span style="color:#96b5b4;">get_breed</span><span>(state, RawStr::from_str("</span><span style="color:#a3be8c;">sib</span><span>")).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">get_breed failed</span><span>");
</span><span> assert_eq!(resp, "</span><span style="color:#a3be8c;">Siberian</span><span>");
</span><span>}
</span></code></pre>
<p>We can also create a mock client that fails.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">test</span><span>]
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">breed_decode_error</span><span>() {
</span><span> </span><span style="color:#b48ead;">struct </span><span>TheCatApiClientMock {}
</span><span> </span><span style="color:#b48ead;">impl </span><span>TheCatApi </span><span style="color:#b48ead;">for </span><span>TheCatApiClientMock {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">search_breeds</span><span>(
</span><span> &</span><span style="color:#bf616a;">self</span><span>,
</span><span> </span><span style="color:#bf616a;">_query</span><span>: &</span><span style="color:#b48ead;">str</span><span>,
</span><span> ) -> Result<BreedResponse, Box<dyn std::error::Error>> {
</span><span> </span><span style="color:#b48ead;">let</span><span> data = "</span><span style="color:#a3be8c;">nope</span><span>";
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> resp: BreedResponse = serde_json::from_str(data)?;
</span><span>
</span><span> Ok(resp)
</span><span> }
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> mock_client = Box::new(TheCatApiClientMock {});
</span><span> </span><span style="color:#b48ead;">let</span><span> rocket = </span><span style="color:#96b5b4;">setup</span><span>(mock_client);
</span><span> </span><span style="color:#b48ead;">let</span><span> state: State<Box<dyn TheCatApi>> =
</span><span> State::from(&rocket).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">managing `TheCatApiClientMock`</span><span>");
</span><span> </span><span style="color:#b48ead;">let</span><span> err = </span><span style="color:#96b5b4;">get_breed</span><span>(state, RawStr::from_str("</span><span style="color:#a3be8c;">sib</span><span>")).</span><span style="color:#96b5b4;">unwrap_err</span><span>();
</span><span> assert_eq!(err.</span><span style="color:#96b5b4;">to_string</span><span>(), "</span><span style="color:#a3be8c;">expected ident at line 1 column 2</span><span>");
</span><span>}
</span></code></pre>
<h2 id="further-discussion"><a class="zola-anchor" href="#further-discussion" aria-label="Anchor link for: further-discussion">Further Discussion</a></h2>
<p>Structuring our application this way makes it easier to inject dependencies. I tend to inject any dependency that is accessing the network or file system. This includes common parts of a web server like databases, loggers, tracing and API clients. The benefits go beyond testing.</p>
<p>When serverless computing came onto the scene, many developers would write their serverless function in such a way that it could only run in the cloud. This created painfully long dev cycles where each change would take 30 seconds to upload to the cloud to verify. Some folks reached for complex solutions like the <a href="https://www.serverless.com/">Serverless Framework</a> that tried to emulate the behavior of the cloud. It may have been better to design the application to accept a <code>Server</code> trait. We could implement that trait once for <a href="https://hyper.rs/">hyper.rs</a> and once for the cloud provider. Our dev cycle would then be much faster and less complex than an emulated system.</p>
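<p>As a rough sketch of what such a <code>Server</code> trait might look like, the application logic below only knows about plain request and response types, and each environment decides how requests arrive. Everything here (<code>Request</code>, <code>Response</code>, <code>handle</code>, <code>LocalServer</code>, and the shape of the trait itself) is hypothetical. A real program would map these types to hyper locally and to the cloud provider's event types remotely.</p>

```rust
// Hypothetical request/response types shared by every environment.
#[derive(Debug, Clone, PartialEq)]
pub struct Request {
    pub path: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Response {
    pub status: u16,
    pub body: String,
}

// The application logic: no knowledge of where requests come from.
pub fn handle(req: &Request) -> Response {
    match req.path.as_str() {
        "/health" => Response { status: 200, body: "ok".to_string() },
        _ => Response { status: 404, body: "not found".to_string() },
    }
}

// Hypothetical `Server` trait: an implementation feeds requests to the
// handler and returns the responses it produced.
pub trait Server {
    fn run(&mut self, handler: &dyn Fn(&Request) -> Response) -> Vec<Response>;
}

// Local implementation that replays in-memory requests. A cloud
// implementation would instead translate the provider's events.
pub struct LocalServer {
    pub incoming: Vec<Request>,
}

impl Server for LocalServer {
    fn run(&mut self, handler: &dyn Fn(&Request) -> Response) -> Vec<Response> {
        self.incoming.drain(..).map(|req| handler(&req)).collect()
    }
}
```

<p>With this shape, <code>LocalServer { incoming: ... }.run(&handle)</code> exercises the same <code>handle</code> function that the cloud implementation would call, without uploading anything.</p>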
<p>This starts to look like <a href="https://en.wikipedia.org/wiki/Hexagonal_architecture_(software)">Hexagonal architecture</a>. We define traits for our network and filesystem dependencies so we can change the behavior of the system. Do not go overboard on this pattern. I tend to implement a trait when the need arises. I also like the <a href="https://www.destroyallsoftware.com/talks/boundaries">Boundaries</a> talk by Gary Bernhardt.</p>
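<p>For a filesystem dependency, the same idea might look like the sketch below. The names <code>FileStore</code>, <code>DiskStore</code>, <code>MemoryStore</code> and <code>count_lines</code> are made up for this illustration. The point is only that the calling code depends on the trait, so tests can swap in an in-memory implementation.</p>

```rust
use std::collections::HashMap;
use std::io;

// Hypothetical port for reading files.
pub trait FileStore {
    fn read(&self, path: &str) -> io::Result<String>;
}

// Adapter backed by the real filesystem.
pub struct DiskStore;

impl FileStore for DiskStore {
    fn read(&self, path: &str) -> io::Result<String> {
        std::fs::read_to_string(path)
    }
}

// Adapter backed by a map, useful in tests.
pub struct MemoryStore {
    pub files: HashMap<String, String>,
}

impl FileStore for MemoryStore {
    fn read(&self, path: &str) -> io::Result<String> {
        self.files.get(path).cloned().ok_or_else(|| {
            io::Error::new(io::ErrorKind::NotFound, format!("{} not found", path))
        })
    }
}

// Application logic that only depends on the trait.
pub fn count_lines(store: &dyn FileStore, path: &str) -> io::Result<usize> {
    Ok(store.read(path)?.lines().count())
}
```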
<h2 id="criticism-and-alternatives"><a class="zola-anchor" href="#criticism-and-alternatives" aria-label="Anchor link for: criticism-and-alternatives">Criticism and Alternatives</a></h2>
<p>One downside to this approach is that we are not testing the actual implementation of <code>search_breeds</code>. We may have a bug in our code that does not show up in our tests. It is important that we keep our <code>search_breeds</code> function as small as possible to mitigate this downside.</p>
<p>If testing the actual implementation of <code>search_breeds</code> is a real concern, then we want to reach for libraries like <a href="https://github.com/ggriffiniii/httptest">httptest</a>. This will define a local web server that can be configured to return specific responses. If we have other types of dependencies, like a database, then we can look for an in-memory implementation of that database. There is no silver bullet solution, so pick the option that best suits your needs.</p>
<h1>WASI example using Rust and Lucet</h1>
<p>Herman J. Radtke III, 2019-04-01, <a href="https://hermanradtke.com/2019/04/01/wasi-example-using-rust-and-lucet.html/">https://hermanradtke.com/2019/04/01/wasi-example-using-rust-and-lucet.html/</a></p>
<p>Lucet is Fastly's <a href="https://www.fastly.com/blog/announcing-lucet-fastly-native-webassembly-compiler-runtime">native WebAssembly compiler and runtime</a>. Using the Lucet runtime and Rust's <code>wasm32-unknown-wasi</code> target, we can create a WASM program that runs on the server.</p>
<span id="continue-reading"></span>
<p>At the time this blog post was written, the <code>wasm32-unknown-wasi</code> target was only available on Rust nightly. Make sure you are using a nightly from April 1, 2019 or later.</p>
<pre data-lang="bash" style="background-color:#2b303b;color:#c0c5ce;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#bf616a;">$</span><span> rustup update
</span><span style="color:#bf616a;">info:</span><span> syncing channel updates for '</span><span style="color:#a3be8c;">stable-x86_64-apple-darwin</span><span>'
</span><span style="color:#bf616a;">info:</span><span> syncing channel updates for '</span><span style="color:#a3be8c;">nightly-x86_64-apple-darwin</span><span>'
</span><span style="color:#bf616a;">352.7</span><span> KiB / 352.7 KiB (100 %) </span><span style="color:#bf616a;">80.0</span><span> KiB/s ETA: 0 s
</span><span style="color:#bf616a;">info:</span><span> latest update on 2019-04-01, rust version 1.35.0-nightly (e3428db7c 2019-03-31)
</span></code></pre>
<p>Add the <code>wasm32-unknown-wasi</code> target using <code>rustup</code>:</p>
<pre data-lang="bash" style="background-color:#2b303b;color:#c0c5ce;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#bf616a;">$</span><span> rustup target add wasm32-unknown-wasi</span><span style="color:#bf616a;"> --toolchain</span><span> nightly
</span><span style="color:#bf616a;">info:</span><span> downloading component '</span><span style="color:#a3be8c;">rust-std</span><span>' for '</span><span style="color:#a3be8c;">wasm32-unknown-wasi</span><span>'
</span><span> </span><span style="color:#bf616a;">10.4</span><span> MiB / 10.4 MiB (100 %) </span><span style="color:#bf616a;">1.1</span><span> MiB/s ETA: 0 s
</span><span style="color:#bf616a;">info:</span><span> installing component '</span><span style="color:#a3be8c;">rust-std</span><span>' for '</span><span style="color:#a3be8c;">wasm32-unknown-wasi</span><span>'
</span></code></pre>
<p>Create a new binary via Cargo and compile it to <code>wasm32-unknown-wasi</code>:</p>
<pre data-lang="bash" style="background-color:#2b303b;color:#c0c5ce;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#bf616a;">$</span><span> cargo init hello
</span><span> </span><span style="color:#bf616a;">Created</span><span> binary (application) </span><span style="color:#bf616a;">package
</span><span style="color:#bf616a;">$</span><span> cd hello/
</span><span style="color:#bf616a;">$</span><span> cargo +nightly build</span><span style="color:#bf616a;"> --target</span><span> wasm32-unknown-wasi
</span><span> </span><span style="color:#bf616a;">Compiling</span><span> hello v0.1.0 (/Users/herman/Code/hello)
</span><span> </span><span style="color:#bf616a;">Finished</span><span> dev </span><span style="color:#b48ead;">[</span><span>unoptimized + debuginfo</span><span style="color:#b48ead;">]</span><span> target(s) </span><span style="color:#bf616a;">in</span><span> 0.59s
</span></code></pre>
<p>We now have a <code>hello.wasm</code> file that supports <a href="https://hacks.mozilla.org/2019/03/standardizing-wasi-a-webassembly-system-interface/">WASI</a>. The <code>hello.wasm</code> file will not run on its own though. We will use Fastly's Lucet runtime to get our program running. I created a Docker container with Lucet already built at https://hub.docker.com/r/hjr3/lucet. I wrote a <a href="/2019/03/31/lucet-in-five-minutes.html">blog post</a> on this if you want more details. Use the <code>hjr3/lucet</code> container to build native <code>x86_64</code> code from our WASM file and then run it using the Lucet runtime:</p>
<pre data-lang="bash" style="background-color:#2b303b;color:#c0c5ce;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#bf616a;">$</span><span> docker run</span><span style="color:#bf616a;"> --rm -it -v </span><span>"$</span><span style="color:#a3be8c;">(</span><span style="color:#bf616a;">pwd</span><span style="color:#a3be8c;">)</span><span>":/usr/local/src hjr3/lucet lucetc-wasi</span><span style="color:#bf616a;"> -o</span><span> hello.so target/wasm32-unknown-wasi/debug/hello.wasm
</span><span style="color:#bf616a;">$</span><span> docker run</span><span style="color:#bf616a;"> --rm -it -v </span><span>"$</span><span style="color:#a3be8c;">(</span><span style="color:#bf616a;">pwd</span><span style="color:#a3be8c;">)</span><span>":/usr/local/src hjr3/lucet lucet-wasi hello.so
</span><span style="color:#bf616a;">Hello,</span><span> world!
</span></code></pre>
<p>One neat thing about this example is that our local development operating system does not have to match our target runtime operating system. We can compile our Rust program locally on macOS and only use the <code>hjr3/lucet</code> Docker container (which runs Ubuntu Xenial) to convert and run the program.</p>
<p>WASI is brand new and a lot of development is still going on. From the <a href="https://github.com/rust-lang/rust/pull/59464">Rust PR</a> that added support for the <code>wasm32-unknown-wasi</code> target:</p>
<blockquote>
<p>The wasi target in libstd is still somewhat bare bones. This PR does not
fill out the filesystem, networking, threads, etc. Instead it only
provides the most basic of integration with the wasi syscalls...</p>
</blockquote>
<p>I plan on demonstrating more examples as libstd gets built out.</p>
<h1>fastly/lucet in five minutes</h1>
<p>Herman J. Radtke III, 2019-03-31, <a href="https://hermanradtke.com/2019/03/31/lucet-in-five-minutes.html/">https://hermanradtke.com/2019/03/31/lucet-in-five-minutes.html/</a></p>
<p>Lucet is Fastly's <a href="https://www.fastly.com/blog/announcing-lucet-fastly-native-webassembly-compiler-runtime">native WebAssembly compiler and runtime</a>. I am a big fan of Rust, Fastly and WASM. Especially WASM on the server via <a href="https://hacks.mozilla.org/2019/03/standardizing-wasi-a-webassembly-system-interface/">WASI</a>. I jumped right in and tried to get my own lucet program running, but the <a href="https://github.com/fastly/lucet/blob/8632b16faf2353727c9aa272d3ac65885eb9e1b9/README.md">setup</a> is a rather long process. My plan was to introduce lucet to some colleagues at my local Rust meetup. I am a huge fan of Rust, but the compile times are an issue. Spending 30 minutes on setup was a non-starter. I was excited when I saw that Fastly published a <a href="https://hub.docker.com/r/fastly/lucet">Docker container</a>:</p>
<span id="continue-reading"></span><blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">The Fastly Lucet image is now available on the Docker Hub. `docker pull fastly/lucet` and you’re all set.</p>— Frank Denis (@jedisct1) <a href="https://twitter.com/jedisct1/status/1111330113864548353?ref_src=twsrc%5Etfw">March 28, 2019</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>However, this container was built for people developing on lucet. The container has all of the required dependencies, but still requires the same initial setup process. So, I decided to take advantage of Docker's multi-stage build process to create a container that has lucet already built. It comes in at the slim size of 107 MB, which should make it fast to download.</p>
<p>Here is how you can get lucet running a simple Hello World program in 5 minutes.</p>
<pre data-lang="bash" style="background-color:#2b303b;color:#c0c5ce;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#bf616a;">$</span><span> docker pull hjr3/lucet
</span><span style="color:#bf616a;">$</span><span> cat hello.c
</span><span style="color:#65737e;">#include <stdio.h>
</span><span>
</span><span style="color:#bf616a;">int</span><span> main(void)
</span><span>{
</span><span> </span><span style="color:#bf616a;">puts</span><span>("</span><span style="color:#a3be8c;">Hello world</span><span>");
</span><span> </span><span style="color:#b48ead;">return</span><span> 0;
</span><span>}
</span><span style="color:#bf616a;">$</span><span> docker run</span><span style="color:#bf616a;"> --rm -it -v </span><span>"$</span><span style="color:#a3be8c;">(</span><span style="color:#bf616a;">pwd</span><span style="color:#a3be8c;">)</span><span>":/usr/local/src hjr3/lucet wasm32-unknown-wasi-clang</span><span style="color:#bf616a;"> -Ofast -o</span><span> hello.wasm hello.c
</span><span style="color:#bf616a;">$</span><span> docker run</span><span style="color:#bf616a;"> --rm -it -v </span><span>"$</span><span style="color:#a3be8c;">(</span><span style="color:#bf616a;">pwd</span><span style="color:#a3be8c;">)</span><span>":/usr/local/src hjr3/lucet lucetc-wasi</span><span style="color:#bf616a;"> -o</span><span> hello.so hello.wasm
</span><span style="color:#bf616a;">$</span><span> docker run</span><span style="color:#bf616a;"> --rm -it -v </span><span>"$</span><span style="color:#a3be8c;">(</span><span style="color:#bf616a;">pwd</span><span style="color:#a3be8c;">)</span><span>":/usr/local/src hjr3/lucet lucet-wasi hello.so
</span><span style="color:#bf616a;">Hello</span><span> world
</span></code></pre>
<p>This Docker container provides a version of clang capable of compiling to wasm via the <code>wasm32-unknown-wasi-clang</code> command. It is not a requirement to compile your program to WASI using <code>wasm32-unknown-wasi-clang</code> in this Docker container. The only requirement is that you compile the program to WASI before using <code>lucetc-wasi</code>. Also, take note that <code>lucetc-wasi</code> and <code>lucet-wasi</code> have very similar spellings, but are indeed two different programs.</p>
<p><del>If you are wondering why I did not demo converting a Rust program to WASI, we are blocked until <a href="https://github.com/rust-lang/rust/pull/59464">wasm32-unknown-wasi</a> is a valid target in rustup. As soon as that target is available, then I plan on creating another post showing how to get Rust + lucet working together.</del> See <a href="/2019/04/01/wasi-example-using-rust-and-lucet.html">WASI example using Rust and Lucet</a> for a Rust example that runs on lucet.</p>
<p>Lucet is not 1.0 yet and I expect it to change a lot. As of this moment, the <a href="https://hub.docker.com/r/hjr3/lucet">hjr3/lucet</a> container is built against <a href="https://github.com/fastly/lucet/commit/e6b399b3fc6794f8f78a8bf6ad404ca640a090c4">fastly/lucet commit e6b399b</a>. As new changes come in, I will do my best to update the container. I may set up an automated process if this proves useful to people. I will tag each version with the fastly/lucet commit the container is built against.</p>
<h1>Removing Connection State In mob</h1>
<p>Herman J. Radtke III, 2018-03-29, <a href="https://hermanradtke.com/2018/03/29/removing-connection-state-from-mob.html/">https://hermanradtke.com/2018/03/29/removing-connection-state-from-mob.html/</a></p>
<p>I started writing mob, a multi-echo server using mio, in 2015. I coded mob into a mostly working state and then left it mostly alone, only updating it to work with the latest stable mio. Recently, I started looking at the code again and had the urge to improve it. In a previous <a href="/2015/10/23/managing-connection-state-with-mio-rust.html">post</a>, I talked about managing the state of connections in mob. In this post, I will walk through what I did to remove the need to track connection state. I wanted to remove the state because the implementation required an <code>O(n)</code> operation every <em>tick</em> of the mio event loop. It also added a fair amount of complexity to the code.</p>
<p>Before discussing the solution, I want to review the problem I was trying to solve. With asynchronous IO, the state of the connection and the events may get out of sync. I kept running into problems where I was processing events and would discover the connection was no longer present in the slab. This would cause mob to panic instead of resetting that connection and moving on. Here is one example of how this might happen:</p>
<ul>
<li><strong>mio</strong>: blocks on poll</li>
<li><strong>client</strong>: client sends some data</li>
<li><strong>mio</strong>: receives read event for connection A</li>
<li><strong>mio</strong>: receives write event for connection A</li>
<li><strong>mio</strong>: unblocks and returns 2 events</li>
<li><strong>mob</strong>: read event is processed, there is an error and connection A is removed from slab</li>
<li><strong>mob</strong>: write event is processed, connection A cannot be found and results in a panic</li>
</ul>
<p>I needed to not panic when <code>connection A</code> was not present in the slab. You can read that previous <a href="/2015/10/23/managing-connection-state-with-mio-rust.html">post</a> for the details on the scheme I concocted to work around this issue. Looking at it now, it is clear to me that I did not fully understand Rust's ownership model and was partially working around that. I was also not clear on how mio (epoll/kqueue) was sending events.</p>
<h2 id="ownership-problems"><a class="zola-anchor" href="#ownership-problems" aria-label="Anchor link for: ownership-problems">Ownership Problems</a></h2>
<p>I have a function that, given a token, would find the corresponding connection in the connection slab. It looks like this (and used to be named <code>find_connection_by_token</code>):</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">connection</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>, </span><span style="color:#bf616a;">token</span><span>: Token) -> &</span><span style="color:#b48ead;">mut</span><span> Connection {
</span><span> &</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>.conns[token]
</span><span>}
</span></code></pre>
<p>This function takes <code>&mut self</code> because it needs to return a mutable reference to the <code>Connection</code>. When I first started writing mob, I did not yet have a good mental model of how to write Rust programs. I fought the borrow checker constantly because I would try to assign the connection to a variable, <code>let conn = self.connection(token);</code>, only to have the compiler tell me this mutable reference was preventing me from using the <code>connection</code> function again later on in the code. It is now clear to me that I should have structured my code to keep all the connection logic in one place and not try to call <code>self.connection(token)</code> from different functions. I was used to working in garbage-collected (GC) languages and C, which have no problems if you have multiple mutable references to objects. I also did not have a clear enough mental model of how mio was working in order to design the code to keep the connection logic in one place. In <code>mio v0.4.x</code>, you had to implement the <code>Handler</code> trait, which forced a certain kind of design on the <a href="https://github.com/carllerche/mio/blob/v0.4.1/test/test_echo_server.rs#L238">code</a>.</p>
<p>I did not want to rewrite large parts of mob to remove the connection state though. To make some progress in the short-term, I made sure my code never kept a reference to a connection object. To do this, I made sure to always chain calls when using the connection object. Something like this:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">connection</span><span>(token).</span><span style="color:#96b5b4;">reregister</span><span>(poll)?;
</span></code></pre>
<p>This allows me to call <code>self.connection()</code> all over my code without holding on to that reference and causing ownership problems. I still think it is a good idea to refactor the code to separate the details of mio and the domain logic of mob, but that is for another time.</p>
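<p>To make the chaining pattern concrete, here is a minimal sketch. The <code>Server</code> and <code>Connection</code> types, the <code>usize</code> token and the <code>readable</code>/<code>writable</code> methods are hypothetical stand-ins and not mob's real code. A <code>HashMap</code> is assumed in place of the slab so the example only needs the standard library.</p>

```rust
use std::collections::HashMap;

// Hypothetical stand-in for mob's connection type.
#[derive(Debug, Default)]
struct Connection {
    reads: usize,
    writes: usize,
}

impl Connection {
    fn readable(&mut self) {
        self.reads += 1;
    }

    fn writable(&mut self) {
        self.writes += 1;
    }
}

// Hypothetical server; a HashMap is assumed in place of the slab.
struct Server {
    conns: HashMap<usize, Connection>,
}

impl Server {
    // The returned mutable borrow is tied to the borrow of `self`.
    fn connection(&mut self, token: usize) -> &mut Connection {
        self.conns.get_mut(&token).expect("unknown token")
    }

    fn handle(&mut self, token: usize) {
        // Each chained call borrows `self` only until the end of its
        // statement, so the next statement can borrow `self` again.
        self.connection(token).readable();
        self.connection(token).writable();
    }
}

// Runs two rounds of events against one connection and reports its counters.
fn demo() -> (usize, usize) {
    let mut conns = HashMap::new();
    conns.insert(1, Connection::default());
    let mut server = Server { conns };
    server.handle(1);
    server.handle(1);
    let conn = server.connection(1);
    (conn.reads, conn.writes)
}

fn main() {
    println!("{:?}", demo());
}
```

<p>Compilers with non-lexical lifetimes accept more code that stores the reference in a variable than the 2015 compiler did, but the chained form keeps each borrow obviously short.</p>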
<h2 id="how-events-are-triggered"><a class="zola-anchor" href="#how-events-are-triggered" aria-label="Anchor link for: how-events-are-triggered">How Events Are Triggered</a></h2>
<p>It is helpful to understand the difference between <em>level-triggered</em> and <em>edge-triggered</em> events before reading the below explanations. Mob receives edge-triggered events. If you are not clear what <em>level-triggered</em> vs <em>edge-triggered</em> means, I suggest you read the section in <a href="https://docs.rs/mio/0.6.14/mio/struct.Poll.html#edge-triggered-and-level-triggered">mio::Poll</a> that defines these two terms. If you want even more detail, I suggest reading <a href="https://idea.popcount.org/2017-02-20-epoll-is-fundamentally-broken-12/">Epoll is fundamentally broken 1/2</a> and <a href="https://idea.popcount.org/2017-03-20-epoll-is-fundamentally-broken-22/">Epoll is fundamentally broken 2/2</a>. Despite the sensational titles, the author of these blog posts goes through multiple examples of different types of triggering. All three of these links focus on read events. I was fuzzy about how a <em>write</em> event is triggered. I originally thought that it had something to do with the client making a call to read. My current understanding is that the events are triggered based on the changing state of the read and write kernel buffers for that connection (or socket).</p>
<p>The kernel read buffer starts out empty. At some point the kernel receives some data from the client, the kernel writes that data to the read buffer and a read event is triggered. The read buffer is now in a state of non-empty. If the kernel receives more data and writes it to the read buffer while it is in a state of non-empty, another event will not be triggered. Another read event will be triggered if, and only if, the kernel read buffer is in a state of empty and then data is written to it. If mob does not read all of the data, then the subsequent call to <code>poll</code> will appear to hang. What is happening is that <code>poll</code> will not receive another read event because the kernel read buffer is still in a non-empty state. This is why it is critically important that whenever mob receives a read event, it reads until it receives <code>WouldBlock</code>. This ensures the kernel read buffer is put back into an empty state and thus able to trigger another read event if it receives more data.</p>
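<p>The "read until <code>WouldBlock</code>" rule can be sketched with nothing but the standard library. The <code>MockSocket</code> type and the <code>drain</code> function below are hypothetical names made up for this illustration. The mock hands out its data in small chunks and then reports <code>WouldBlock</code>, which is roughly how a non-blocking socket behaves once the kernel read buffer is empty.</p>

```rust
use std::io::{self, ErrorKind, Read};

// Hypothetical non-blocking source. It returns at most `chunk` bytes per
// call and reports `WouldBlock` once all of its data has been consumed.
struct MockSocket {
    data: Vec<u8>,
    pos: usize,
    chunk: usize,
}

impl Read for MockSocket {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.pos == self.data.len() {
            return Err(io::Error::new(ErrorKind::WouldBlock, "no data"));
        }
        let n = self.chunk.min(buf.len()).min(self.data.len() - self.pos);
        buf[..n].copy_from_slice(&self.data[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}

// Keep reading until the source reports `WouldBlock`, so the (simulated)
// read buffer is left empty before returning to the event loop.
fn drain<R: Read>(src: &mut R) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    let mut buf = [0u8; 4];
    loop {
        match src.read(&mut buf) {
            // A zero-length read means the peer closed the connection.
            Ok(0) => return Ok(out),
            Ok(n) => out.extend_from_slice(&buf[..n]),
            Err(ref e) if e.kind() == ErrorKind::WouldBlock => return Ok(out),
            Err(e) => return Err(e),
        }
    }
}

fn demo() -> Vec<u8> {
    let mut sock = MockSocket { data: b"hello world".to_vec(), pos: 0, chunk: 3 };
    drain(&mut sock).expect("mock socket only fails with WouldBlock")
}

fn main() {
    println!("{}", String::from_utf8_lossy(&demo()));
}
```

<p>Stopping after the first successful <code>read</code> would leave bytes behind, which is the situation where the next <code>poll</code> appears to hang.</p>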
<p>Write events are a little different because we usually do not have enough data to fill up the kernel write buffer until we receive <code>WouldBlock</code>. The kernel write buffer starts out in an empty state. When a connection registers for write, it will receive a write event due to the empty state of the buffer. The connection can then write some data, but in the mob case will almost certainly not fill up the buffer as mob messages are quite small. The kernel write buffer is in a non-empty state and will not trigger another write event until the write buffer is empty. The kernel will then try to send the data to the client and once all the data is sent the kernel will trigger another write event (assuming the connection is still registered to receive write events). During the time between the initial write and the kernel sending the contents of the buffer, the connection is still allowed to write until it receives <code>WouldBlock</code>.</p>
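<p>The write side can be sketched the same way. Again, <code>MockSocket</code> and <code>write_some</code> are hypothetical names, and the fixed <code>capacity</code> is an assumption that stands in for a kernel write buffer that can fill up.</p>

```rust
use std::io::{self, ErrorKind, Write};

// Hypothetical non-blocking sink with a fixed capacity. Once it is full,
// further writes report `WouldBlock`.
struct MockSocket {
    buffer: Vec<u8>,
    capacity: usize,
}

impl Write for MockSocket {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let free = self.capacity - self.buffer.len();
        if free == 0 {
            return Err(io::Error::new(ErrorKind::WouldBlock, "buffer full"));
        }
        let n = free.min(buf.len());
        self.buffer.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

// Writes as much of `data` as the sink accepts and returns the number of
// bytes written. The caller keeps the remainder for the next write event.
fn write_some<W: Write>(dst: &mut W, data: &[u8]) -> io::Result<usize> {
    let mut written = 0;
    while written < data.len() {
        match dst.write(&data[written..]) {
            Ok(0) => break,
            Ok(n) => written += n,
            Err(ref e) if e.kind() == ErrorKind::WouldBlock => break,
            Err(e) => return Err(e),
        }
    }
    Ok(written)
}

fn main() {
    let mut sock = MockSocket { buffer: Vec::new(), capacity: 8 };
    let sent = write_some(&mut sock, b"hello world").expect("write failed");
    println!("sent {} of {} bytes", sent, b"hello world".len());
}
```

<p>A small message fits in one call and never sees <code>WouldBlock</code>, which matches the mob case described above. Only a message larger than the free space is cut short.</p>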
<p>To round out my understanding, let us also briefly talk about the hangup (hup) event. The hup event works like read and write events. The connection is in an <em>established</em> state. When the client closes their end of the connection, the state changes to closed (or reset) and the connection will receive the hup event.</p>
<h2 id="solution"><a class="zola-anchor" href="#solution" aria-label="Anchor link for: solution">Solution</a></h2>
<p>With my improved understanding of ownership and a more accurate mental model of how the kernel sends events, the fix is pretty simple. Before processing any events, make sure the connection is in the slab. The diff of the change is <a href="https://github.com/hjr3/mob/pull/23/commits/485487217ddde7d316d7c7b0ac9057696278bc43#diff-4ce93534efc34e923ce01e975eb7ed80R105">here</a>. Most of the changes are removing code, so let me walk through the important parts.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">if </span><span style="color:#bf616a;">self</span><span>.token != token && </span><span style="color:#bf616a;">self</span><span>.conns.</span><span style="color:#96b5b4;">contains</span><span>(token) == </span><span style="color:#d08770;">false </span><span>{
</span><span> debug!("</span><span style="color:#a3be8c;">Failed to find connection for {:?}</span><span>", token);
</span><span> </span><span style="color:#b48ead;">return</span><span>;
</span><span>}
</span></code></pre>
<p>The <code>self.token</code> is the token for the server and is not present in the slab. The slab is indexed by the tokens, so <code>self.conns.contains()</code> is a constant time lookup. Much better than iterating through a list of connections. I am not quite done though. If the connection encounters some error, I need to remove it from the slab. To do this I replaced <code>self.find_connection_by_token(token).mark_reset();</code> with <code>self.remove_token(token);</code>.</p>
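<p>Here is a small, self-contained sketch of that guard, replaying the event sequence from the start of this post. Everything in it is hypothetical: the <code>Event</code> and <code>Outcome</code> enums, the <code>fails_on_read</code> flag and the <code>ready</code> method are made up for illustration, and a <code>HashMap</code> is assumed in place of the slab.</p>

```rust
use std::collections::HashMap;

// Hypothetical event type; a real mio event carries more information.
#[derive(Debug, Clone, Copy)]
enum Event {
    Read(usize),
    Write(usize),
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Handled,
    Removed,
    Skipped,
}

// Hypothetical connection that can be told to fail on its next read.
struct Connection {
    fails_on_read: bool,
}

struct Server {
    // A HashMap is assumed in place of the slab; lookup by token is
    // constant time on average.
    conns: HashMap<usize, Connection>,
}

impl Server {
    fn ready(&mut self, event: Event) -> Outcome {
        let token = match event {
            Event::Read(t) | Event::Write(t) => t,
        };

        // Guard: the connection may have been removed earlier in this tick.
        if !self.conns.contains_key(&token) {
            return Outcome::Skipped;
        }

        if let Event::Read(_) = event {
            if self.conns[&token].fails_on_read {
                // Remove the connection right away instead of marking it.
                self.conns.remove(&token);
                return Outcome::Removed;
            }
        }
        Outcome::Handled
    }
}

// Connection 1 fails on read, so its later write event must be skipped.
fn replay() -> Vec<Outcome> {
    let mut conns = HashMap::new();
    conns.insert(1, Connection { fails_on_read: true });
    conns.insert(2, Connection { fails_on_read: false });
    let mut server = Server { conns };

    vec![Event::Read(1), Event::Write(1), Event::Write(2)]
        .into_iter()
        .map(|event| server.ready(event))
        .collect()
}

fn main() {
    println!("{:?}", replay());
}
```

<p>The write event for the removed connection is skipped rather than causing a panic, and no per-tick sweep over all connections is needed.</p>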
<p>In hindsight, this was a pretty obvious change to make. Some <a href="https://www.reddit.com/r/rust/comments/3q0hjt/managing_connection_state_with_mio_herman_j/cwb7n3r/">comments</a> in the <a href="https://www.reddit.com/r/rust/comments/3q0hjt/managing_connection_state_with_mio_herman_j/">reddit post</a> on managing the connection state were trying to explain this to me, but I did not get it at the time. There are a few other things in mob that follow this same pattern: an original naive solution where I now see an obvious improvement to make. I hope to make those changes as well to continue to <em>diff</em> my mindset between 2015 and now.</p>
http://activitystrea.ms/schema/1.0/postFuture Based mpsc Queue Example with Tokio2017-03-03T00:00:00+00:002017-03-03T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2017/03/03/future-mpsc-queue-with-tokio.html/<p>I was looking to use the <a href="https://github.com/alexcrichton/futures-rs/blob/0.1.10/src/sync/mpsc/mod.rs">mpsc queue</a> that comes in the futures crate in <a href="https://github.com/hjr3/weldr">weldr</a>. Weldr uses hyper (which uses tokio), so it makes sense to use tokio's Core as the executor. I did not have a good understanding of how this futures-based mpsc queue worked. It has some subtle differences from the mpsc queue in the std library. I spent some time reading the documentation on https://tokio.rs/ and a lot of source code, and finally ended up writing a small example program. I have written a decent amount of inline comments with my understanding of how this all works.</p>
<p>A complete working example can be found <a href="https://github.com/hjr3/future-mpsc-example">here</a>. I wrote this using Rust version <code>1.15.1 (021bd294c 2017-02-08)</code>. For crate versions, please check the <a href="https://github.com/hjr3/future-mpsc-example">Cargo.toml</a> in the repository.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">macro_use</span><span>] </span><span style="color:#b48ead;">extern crate</span><span> log;
</span><span style="color:#b48ead;">extern crate</span><span> env_logger;
</span><span style="color:#b48ead;">extern crate</span><span> futures;
</span><span style="color:#b48ead;">extern crate</span><span> tokio_core;
</span><span>
</span><span style="color:#b48ead;">use </span><span>std::{thread, time};
</span><span>
</span><span style="color:#b48ead;">use </span><span>futures::{Stream, Sink, Future};
</span><span style="color:#b48ead;">use </span><span>futures::sync::mpsc;
</span><span>
</span><span style="color:#b48ead;">use </span><span>tokio_core::reactor::Core;
</span><span>
</span><span>#[</span><span style="color:#bf616a;">derive</span><span>(Debug)]
</span><span style="color:#b48ead;">struct </span><span>Stats {
</span><span> </span><span style="color:#b48ead;">pub </span><span style="color:#bf616a;">success</span><span>: </span><span style="color:#b48ead;">usize</span><span>,
</span><span> </span><span style="color:#b48ead;">pub </span><span style="color:#bf616a;">failure</span><span>: </span><span style="color:#b48ead;">usize</span><span>,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span>
</span><span> env_logger::init().</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Failed to initialize logger</span><span>");
</span><span>
</span><span> </span><span style="color:#65737e;">// tokio Core is an event loop executor. An executor is what runs a future to
</span><span> </span><span style="color:#65737e;">// completion.
</span><span> </span><span style="color:#b48ead;">let mut</span><span> core = Core::new().</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Failed to create core</span><span>");
</span><span>
</span><span> </span><span style="color:#65737e;">// `core.remote()` is a thread safe version of `core.handle()`. Both `core.remote()`
</span><span> </span><span style="color:#65737e;">// and `core.handle()` are used to spawn a future. When a future is _spawned_,
</span><span> </span><span style="color:#65737e;">// it basically means that it is being executed.
</span><span> </span><span style="color:#b48ead;">let</span><span> remote = core.</span><span style="color:#96b5b4;">remote</span><span>();
</span><span>
</span><span> </span><span style="color:#65737e;">// Now we create a multi-producer, single-consumer channel. This channel is very
</span><span> </span><span style="color:#65737e;">// similar to the mpsc channel in the std library. One big difference with this
</span><span> </span><span style="color:#65737e;">// channel is that `tx` and `rx` return futures. In order to have `tx` or `rx`
</span><span> </span><span style="color:#65737e;">// actually do any work, they have to be _executed_ by Core.
</span><span> </span><span style="color:#65737e;">//
</span><span> </span><span style="color:#65737e;">// The parameter passed to `mpsc::channel()` determines how large the queue is
</span><span> </span><span style="color:#65737e;">// _per tx_. Since we are cloning `tx` per iteration of the loop, we are guaranteed
</span><span> </span><span style="color:#65737e;">// 1 spot for each loop iteration. Cloning tx is how we get multiple producers.
</span><span> </span><span style="color:#65737e;">//
</span><span> </span><span style="color:#65737e;">// For more detail on mpsc, see https://tokio.rs/docs/going-deeper/synchronization/
</span><span> </span><span style="color:#65737e;">//
</span><span> </span><span style="color:#65737e;">// Quick note:
</span><span> </span><span style="color:#65737e;">// - `tx` is of type `Sink`. A sink is something that you can place a value into
</span><span> </span><span style="color:#65737e;">// and then _flush_ the value into the queue.
</span><span> </span><span style="color:#65737e;">// - `rx` is of type `Stream`. A stream is an iterator of _future_ values.
</span><span> </span><span style="color:#65737e;">// More details on `tx` and `rx` below. For even more detail, see
</span><span> </span><span style="color:#65737e;">// https://tokio.rs/docs/getting-started/streams-and-sinks/
</span><span> </span><span style="color:#b48ead;">let </span><span>(tx, rx) = mpsc::channel(</span><span style="color:#d08770;">1</span><span>);
</span><span>
</span><span> </span><span style="color:#65737e;">// Create a thread that performs some work.
</span><span> thread::spawn(</span><span style="color:#b48ead;">move </span><span>|| {
</span><span> </span><span style="color:#b48ead;">loop </span><span>{
</span><span> </span><span style="color:#b48ead;">let</span><span> tx = tx.</span><span style="color:#96b5b4;">clone</span><span>();
</span><span>
</span><span> </span><span style="color:#65737e;">// INSERT WORK HERE - the work should be modeled as having a _future_ result.
</span><span> </span><span style="color:#b48ead;">let</span><span> delay = time::Duration::from_secs(</span><span style="color:#d08770;">1</span><span>);
</span><span> thread::sleep(delay);
</span><span>
</span><span> </span><span style="color:#65737e;">// In this fake example, we do not care about the values of the `Ok` and `Err`
</span><span> </span><span style="color:#65737e;">// variants. Thus, we can use `()` for both.
</span><span> </span><span style="color:#65737e;">// Note: `::futures::done()` will be called `::futures::result()` in later
</span><span> </span><span style="color:#65737e;">// versions of the futures crate.
</span><span> </span><span style="color:#b48ead;">let</span><span> f = ::futures::done::<(), ()>(Ok(()));
</span><span> </span><span style="color:#65737e;">// END WORK
</span><span>
</span><span> </span><span style="color:#65737e;">// `remote.spawn` accepts a closure with a single parameter of type `&Handle`.
</span><span> </span><span style="color:#65737e;">// In this example, the `&Handle` is not needed. The future returned from the
</span><span> </span><span style="color:#65737e;">// closure will be executed.
</span><span> </span><span style="color:#65737e;">//
</span><span> </span><span style="color:#65737e;">// Note: We must use `remote.spawn()` instead of `handle.spawn()` because the
</span><span> </span><span style="color:#65737e;">// Core was created on a different thread.
</span><span> remote.</span><span style="color:#96b5b4;">spawn</span><span>(|_| {
</span><span>
</span><span> </span><span style="color:#65737e;">// Use the `.then()` combinator to get the result of our "fake work" so we
</span><span> </span><span style="color:#65737e;">// can send it through the channel.
</span><span> f.</span><span style="color:#96b5b4;">then</span><span>(|</span><span style="color:#bf616a;">res</span><span>| {
</span><span>
</span><span> </span><span style="color:#65737e;">// Using `tx`, the result of the above work can be sent over the
</span><span> </span><span style="color:#65737e;">// channel. Note that we also add the `.then()` combinator. Any
</span><span> </span><span style="color:#65737e;">// future passed to `handle.spawn()` must be of type
</span><span> </span><span style="color:#65737e;">// `Future<Item=(), Error=()>`. In the case of `tx.send()`, the
</span><span> </span><span style="color:#65737e;">// `tx` (Sink) will be returned if the result was successfully
</span><span> </span><span style="color:#65737e;">// flushed or a `SinkError` if the result could not be flushed.
</span><span> tx
</span><span> .</span><span style="color:#96b5b4;">send</span><span>(res)
</span><span> .</span><span style="color:#96b5b4;">then</span><span>(|</span><span style="color:#bf616a;">tx</span><span>| {
</span><span> </span><span style="color:#b48ead;">match</span><span> tx {
</span><span> Ok(_tx) => {
</span><span> info!("</span><span style="color:#a3be8c;">Sink flushed</span><span>");
</span><span> Ok(())
</span><span> }
</span><span> Err(e) => {
</span><span> error!("</span><span style="color:#a3be8c;">Sink failed! {:?}</span><span>", e);
</span><span> Err(())
</span><span> }
</span><span> }
</span><span> }) </span><span style="color:#65737e;">// <-- no semi-colon here! Result of `tx.send.then()` is a future.
</span><span> }) </span><span style="color:#65737e;">// <-- no semi-colon here! Result of `f.then()` will be spawned.
</span><span> });
</span><span> }
</span><span> });
</span><span>
</span><span> </span><span style="color:#65737e;">// I created a `Stats` type here. I could have used something like `counter: usize`,
</span><span> </span><span style="color:#65737e;">// but that implements `Copy`. I dislike examples that use types that implement
</span><span> </span><span style="color:#65737e;">// `Copy` because they are deceptively easier to make work.
</span><span> </span><span style="color:#b48ead;">let mut</span><span> stats = Stats { success: </span><span style="color:#d08770;">0</span><span>, failure: </span><span style="color:#d08770;">0 </span><span>};
</span><span>
</span><span> </span><span style="color:#65737e;">// As mentioned above, rx is a stream. That means we are expecting multiple _future_
</span><span> </span><span style="color:#65737e;">// values. Here we use `for_each` to yield each value as it comes through the channel.
</span><span> </span><span style="color:#b48ead;">let</span><span> f2 = rx.</span><span style="color:#96b5b4;">for_each</span><span>(|</span><span style="color:#bf616a;">res</span><span>| {
</span><span>
</span><span> </span><span style="color:#65737e;">// Remember that our fake work was modeled with `::futures::done()`. We need to
</span><span> </span><span style="color:#65737e;">// check if the future returned the `Ok` or `Err` variant and increment the
</span><span> </span><span style="color:#65737e;">// counter accordingly.
</span><span> </span><span style="color:#b48ead;">match</span><span> res {
</span><span> Ok(_) => stats.success += </span><span style="color:#d08770;">1</span><span>,
</span><span> Err(_) => stats.failure += </span><span style="color:#d08770;">1</span><span>,
</span><span> }
</span><span> info!("</span><span style="color:#a3be8c;">stats = {:?}</span><span>", stats);
</span><span>
</span><span> </span><span style="color:#65737e;">// The stream will stop on `Err`, so we need to return `Ok`.
</span><span> Ok(())
</span><span> });
</span><span>
</span><span> </span><span style="color:#65737e;">// The executor is started by the call to `core.run()` and will finish once the `f2`
</span><span> </span><span style="color:#65737e;">// future is finished. Keep in mind that since `rx` is a stream, it will not finish
</span><span> </span><span style="color:#65737e;">// until there is an error. Using a stream with `core.run()` is a common pattern and
</span><span> </span><span style="color:#65737e;">// is how servers are normally implemented.
</span><span> core.</span><span style="color:#96b5b4;">run</span><span>(f2).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Core failed to run</span><span>");
</span><span>}
</span></code></pre>
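<p>For comparison, here is a rough sketch of the same multi-producer counting pattern using the blocking channel from the std library. This is not part of the example repository; the <code>run</code> function and its input are made up for illustration. The main differences I would point out are that <code>send</code> and the receive loop block the calling thread instead of returning futures, and that the bound passed to <code>sync_channel</code> is shared by all senders.</p>

```rust
use std::sync::mpsc;
use std::thread;

#[derive(Debug, Default, PartialEq)]
struct Stats {
    success: usize,
    failure: usize,
}

// Sends each result from its own producer thread and tallies them on the
// calling thread. `run` is a hypothetical helper for this sketch.
fn run(results: Vec<Result<(), ()>>) -> Stats {
    // A bounded channel; `send` blocks while the buffer is full.
    let (tx, rx) = mpsc::sync_channel::<Result<(), ()>>(1);

    // Cloning `tx` is how we get multiple producers, same as above.
    let handles: Vec<_> = results
        .into_iter()
        .map(|res| {
            let tx = tx.clone();
            thread::spawn(move || {
                tx.send(res).expect("receiver hung up");
            })
        })
        .collect();

    // Drop the original sender so the receive loop ends once every clone
    // has been dropped.
    drop(tx);

    let mut stats = Stats::default();
    for res in rx {
        match res {
            Ok(_) => stats.success += 1,
            Err(_) => stats.failure += 1,
        }
    }

    for handle in handles {
        handle.join().expect("producer thread panicked");
    }
    stats
}

fn main() {
    let stats = run(vec![Ok(()), Err(()), Ok(())]);
    println!("stats = {:?}", stats);
}
```

<p>No executor is involved here. The trade-off is that the consuming thread can do nothing else while it waits, which is what the futures-based channel avoids.</p>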
http://activitystrea.ms/schema/1.0/postInitial v0.1.0 release of weldr - a reverse proxy written in Rust2017-02-15T00:00:00+00:002017-02-15T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2017/02/15/weldr-reverse-proxy-initial-release-rust.html/<p>Note: This project was originally named <a href="https://github.com/hjr3/alacrity/issues/69">alacrity</a>.</p>
<p>Over the past few months I have been working on building a reverse proxy, called <a href="https://github.com/hjr3/weldr">weldr</a>, in Rust. I have just published the initial <a href="https://github.com/hjr3/weldr/releases/tag/0.1.0">release</a> of weldr. I have been interested in doing something with networks in Rust. I have spent a lot of time building hypermedia APIs, so doing something with HTTP seemed like a good fit. I started out using mio and later switched to <a href="https://github.com/tokio-rs/tokio">tokio</a>. While this was fun to do, I quickly realized that I was spending most of my time implementing the HTTP spec instead of building the features I was most excited about. The <a href="https://github.com/hyperium/hyper">hyper</a> HTTP library was also switching over to tokio and was way ahead of where I was. After talking with Sean, I made the decision to use hyper as the foundation for weldr moving forward. There are still a number of things a proxy must do in order to conform with the various HTTP related RFCs. I will be working on those proxy specific requirements while adding the features that I want in a reverse proxy.</p>
<p>There are two general problems I am trying to solve with weldr. The first problem is that popular open source proxies do not work as well as I would like them to in dynamic cloud/container environments. The reason is that dynamic parts of a proxy, such as the list of backend servers, are defined by a configuration file that is read when the proxy is started. I want to build a proxy that has a minimal configuration file and drive most of the behavior through a set of APIs. This may make it harder to use weldr for simple use cases, but I hope weldr can make more complex environments a lot easier. I will note that there are some products that do this now, such as NGINX Plus, but those are cost prohibitive for many.</p>
<p>The second, more aspirational, problem is around the <em>availability</em> of the proxy. If I want to put a reverse proxy in front of a critical cluster of web servers, I have two basic options: use an active/passive setup or use DNS. For an active/passive setup, the go-to is <a href="http://www.keepalived.org/">keepalived</a>. I think keepalived does a great job, but it is a real pain to set up and ensure it is working correctly. Even more so if you are trying to automate the creation of servers. I want to start multiple weldr proxy servers and have them automatically determine a leader with one or more followers. This means that keepalived-style logic must be embedded inside weldr. I plan on using Raft to accomplish this. The current Rust raft library is missing some features that I require though. I hope to contribute those at some point in the future. The other solution, DNS, is even more challenging to get working correctly. It may be necessary though in order to handle a large number of requests. I have some very early thoughts about making this work better as well. I hope to write about those thoughts more in the future.</p>
<p>Weldr is definitely not production ready, but it is working well enough that you can play with it. I would love any feedback on what I have done so far and my plans for the future.</p>
http://activitystrea.ms/schema/1.0/postUsing and_then and map combinators on the Rust Result Type2016-09-12T00:00:00+00:002016-09-12T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2016/09/12/rust-using-and_then-and-map-combinators-on-result-type.html/<p>If you have spent any amount of time learning Rust, you quickly become accustomed to <code>Option</code> and <code>Result</code> types. It is through these two core types that we make our programs reliable. My background is with C and dynamic languages. I found it easiest to use the <code>match</code> keyword when working with these types. There are also combinator functions like <code>map</code> and <code>and_then</code> which allow a set of computations to be chained together. I like to chain combinators together so error logic is separated from the main logic of the code.</p>
<p>I recently returned home from RustConf 2016 where the <a href="https://github.com/alexcrichton/futures-rs">futures</a> crate had a <a href="https://crates.io/crates/futures/0.1.1">0.1.1</a> release along with the first glimpses of <a href="https://github.com/tokio-rs/tokio">tokio</a>. All futures implement a <code>poll</code> function that returns a <a href="https://github.com/alexcrichton/futures-rs/blob/9bd186bef3430d26747ee886c54d5e68e0405275/src/lib.rs#L354">Poll</a> type. The <code>Poll</code> type is defined as <code>pub type Poll<T, E> = Result<Async<T>, E>;</code>. <del>Thus, if we want to use futures, we need to be comfortable with combinator functions implemented on the core <code>Result</code> type. You will not be able to fall back on using the <code>match</code> keyword.</del> Many of the examples that I have seen used combinator functions to chain futures together. We can look at how <code>and_then</code> and <code>map</code> combinators work on the <code>Result</code> type and get a better understanding of how combinators work without the additional mental load of trying to understand how futures work. Once we are comfortable with combinators, we should be better able to understand the examples that use combinators to chain futures together. (Edit: Revised the previous sentence per the discussion on <a href="https://www.reddit.com/r/rust/comments/52lsbb/using_and_then_and_map_combinators_on_the_rust/d7lbf4n">r/rust</a>).</p>
<h3 id="approach"><a class="zola-anchor" href="#approach" aria-label="Anchor link for: approach">Approach</a></h3>
<p>I will be providing explicit types throughout the examples to make it easier to understand what is happening. In the vast majority of cases, you can let the compiler infer the types. In fact, it is idiomatic to let the compiler infer the types.</p>
<p>The <a href="https://doc.rust-lang.org/std/result/enum.Result.html#method.and_then">Standard Library API Reference</a> for <code>Result</code> combinators does a good job with explanations and simple examples. However, most examples use the same type for both the <code>Ok</code> and <code>Err</code> variants. I think this makes it harder to understand what is going on. I will be using an <code>Err(&'static str)</code> variant in the examples so I can use easy to identify error messages. If the <code>'static</code> lifetime confuses you, know that <code>&'static str</code> means a hard-coded string literal. Example: <code>let foo: &'static str = "Hello World!";</code>.</p>
<h2 id="and-then-combinator"><a class="zola-anchor" href="#and-then-combinator" aria-label="Anchor link for: and-then-combinator"><code>and_then</code> Combinator</a></h2>
<p>Let us start with the <code>and_then</code> combinator function. The <code>and_then</code> combinator is a function that calls a closure if, and only if, the variant of the <code>Result</code> enum type is <code>Ok(T)</code>.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> res: Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>> = Ok(</span><span style="color:#d08770;">5</span><span>);
</span><span style="color:#b48ead;">let</span><span> value = res.</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">usize</span><span>| Ok(n * </span><span style="color:#d08770;">2</span><span>));
</span><span>assert_eq!(Ok(</span><span style="color:#d08770;">10</span><span>), value);
</span></code></pre>
<p>In this first example, the value of <code>res</code> is <code>Ok(5)</code>. Per our definition of <code>and_then</code>: <code>and_then</code> will match on the <code>Ok</code> variant and call the closure with the <code>usize</code> value of 5 as the argument. What happens if <code>res</code> is an <code>Err</code> variant?</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>let res: Result<usize, &'static str> = Err("error");
</span><span>let value = res.and_then(|n: usize| Ok(n * 2)); // <--- closure is not called
</span><span>assert_eq!(Err("error"), value);
</span></code></pre>
<p>In this second example, the value of <code>res</code> is <code>Err("error")</code>. Per our definition of <code>and_then</code>: <code>and_then</code> will match on the <code>Err</code> variant and <em>skip</em> calling the closure. The value of <code>Err("error")</code> will be returned as is. This is convenient as we were able to write a closure that ignored errors. The value <code>Err("error")</code> will be passed along in the background to the end of the combinator chain. So far we have only been returning <code>Ok</code> from the closure. Our closure can also return an <code>Err</code>.</p>
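<p>If you are more comfortable with <code>match</code>, it may help to see the two cases above written out by hand. The function below is only an illustration of the behavior described so far, not the standard library's source, and the names <code>and_then_by_hand</code> and <code>double</code> are made up for this sketch.</p>

```rust
// Hand-written equivalent of `Result::and_then`, for illustration only.
fn and_then_by_hand<T, U, E, F>(res: Result<T, E>, f: F) -> Result<U, E>
where
    F: FnOnce(T) -> Result<U, E>,
{
    match res {
        // `Ok`: call the closure with the inner value.
        Ok(value) => f(value),
        // `Err`: skip the closure and pass the error along unchanged.
        Err(e) => Err(e),
    }
}

// Same computation as the two examples above.
fn double(res: Result<usize, &'static str>) -> Result<usize, &'static str> {
    and_then_by_hand(res, |n: usize| Ok(n * 2))
}

fn main() {
    println!("{:?}", double(Ok(5)));
    println!("{:?}", double(Err("error")));
}
```

<p>Notice that the closure decides which variant comes out of the <code>Ok</code> arm, so nothing stops it from returning an <code>Err</code>.</p>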
<h3 id="chaining-multiple-and-then-functions"><a class="zola-anchor" href="#chaining-multiple-and-then-functions" aria-label="Anchor link for: chaining-multiple-and-then-functions">Chaining Multiple <code>and_then</code> Functions</a></h3>
<p>Instead of multiplying, let us divide 2 by the result <code>n</code>. To protect against division by zero errors, we need to add another step in the chain that will return an error if the value is zero.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> res: Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>> = Ok(</span><span style="color:#d08770;">0</span><span>);
</span><span>
</span><span style="color:#b48ead;">let</span><span> value = res
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">usize</span><span>| {
</span><span> </span><span style="color:#b48ead;">if</span><span> n == </span><span style="color:#d08770;">0 </span><span>{
</span><span> Err("</span><span style="color:#a3be8c;">cannot divide by zero</span><span>")
</span><span> } </span><span style="color:#b48ead;">else </span><span>{
</span><span> Ok(n)
</span><span> }
</span><span> })
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">usize</span><span>| Ok(</span><span style="color:#d08770;">2 </span><span>/ n)); </span><span style="color:#65737e;">// <--- closure is not called
</span><span>
</span><span>assert_eq!(Err("</span><span style="color:#a3be8c;">cannot divide by zero</span><span>"), value);
</span></code></pre>
<p>The initial value of <code>Ok(0)</code> will be passed to the first closure. In this case, <code>n</code> does equal <code>0</code> and the closure returns <code>Err("cannot divide by zero")</code>. Our next call to <code>and_then</code> identifies that we now have an <code>Err</code> variant of <code>Result</code> and does not call the closure.</p>
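<p>To see every path through this chain, it can help to wrap it in a function and try a few inputs. The following is a minimal sketch; the helper name <code>divide_two_by</code> is an assumption made for illustration and is not part of the examples above.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">// Hypothetical helper: the same two-step chain as above, wrapped in a
// function so that several inputs can be tried.
fn divide_two_by(res: Result<usize, &'static str>) -> Result<usize, &'static str> {
    res
        // Step 1: turn a zero into an error, pass any other value along.
        .and_then(|n: usize| {
            if n == 0 {
                Err("cannot divide by zero")
            } else {
                Ok(n)
            }
        })
        // Step 2: only runs if step 1 returned `Ok`.
        .and_then(|n: usize| Ok(2 / n))
}
</code></pre>
<p>With this helper, <code>divide_two_by(Ok(2))</code> reaches the second closure and returns <code>Ok(1)</code>, <code>divide_two_by(Ok(0))</code> stops at the first closure, and an incoming <code>Err</code> skips both closures.</p>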
<h3 id="flattening-results"><a class="zola-anchor" href="#flattening-results" aria-label="Anchor link for: flattening-results">Flattening Results</a></h3>
<p>There are times when we have nested <code>Result</code> types. It is generally a good strategy to try and flatten the result out. For example, we can flatten <code>Result<Result<usize, &'static str>, &'static str></code> to <code>Result<usize, &'static str></code>. A flatter <code>Result</code> is generally easier for later code to deal with.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> res: Result<Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>>, &</span><span style="color:#b48ead;">'static str</span><span>> = Ok(Ok(</span><span style="color:#d08770;">5</span><span>));
</span><span>
</span><span style="color:#b48ead;">let</span><span> value = res
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">n</span><span>: Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>>| {
</span><span> n </span><span style="color:#65737e;">// <--- this is either Ok(usize) or Err(&'static str)
</span><span> })
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">usize</span><span>| {
</span><span> Ok(n * </span><span style="color:#d08770;">2</span><span>)
</span><span> });
</span><span>
</span><span>assert_eq!(Ok(</span><span style="color:#d08770;">10</span><span>), value);
</span></code></pre>
<p>In the above example, the first <code>and_then</code> closure is returning <code>n</code>. Note that in previous examples, we were wrapping our return value in either the <code>Ok</code> or <code>Err</code> variant of the <code>Result</code> enum. In this example, our goal is to flatten the result so we will not explicitly return <code>Ok</code> or <code>Err</code>. The value of <code>n</code> is going to be either <code>Ok(usize)</code> or <code>Err(&'static str)</code>. As such, we can return <code>n</code> as it is. If <code>n</code> is the <code>Ok(usize)</code> variant then the value will be passed to the next <code>and_then</code> as expected. If <code>n</code> is the <code>Err(&'static str)</code> variant then the closure of the second <code>and_then</code> will be bypassed.</p>
<p>The <code>and_then</code> function is called <code>flatMap</code> in Scala and you can see why. We are flattening the type from <code>Result<Result<_, _>, _></code> to <code>Result<_, _></code> by <em>mapping</em> variants in the internal <code>Result</code> to the outer <code>Result</code>.</p>
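<p>The flattening step on its own is small enough to pull into a helper. This is a sketch under the assumption that the inner and outer error types are the same; the name <code>flatten</code> is chosen here for illustration.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">// Hypothetical helper: flatten a nested `Result` whose inner and outer
// error types are both `&'static str`.
fn flatten(
    res: Result<Result<usize, &'static str>, &'static str>,
) -> Result<usize, &'static str> {
    // The closure returns the inner `Result` unchanged, which removes one
    // layer of nesting. An outer `Err` never reaches the closure.
    res.and_then(|inner| inner)
}
</code></pre>
<p>All three possible shapes of input collapse as we would hope: <code>Ok(Ok(5))</code> becomes <code>Ok(5)</code>, <code>Ok(Err("inner"))</code> becomes <code>Err("inner")</code> and <code>Err("outer")</code> stays <code>Err("outer")</code>.</p>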
<h2 id="map-combinator"><a class="zola-anchor" href="#map-combinator" aria-label="Anchor link for: map-combinator"><code>map</code> Combinator</a></h2>
<p>So far we have been using <code>and_then</code> to combine computation and flatten our nested <code>Result</code>s. The examples have been using <code>Result</code>s with types that are the same types we wanted to end up with. Sometimes we are given a <code>Result</code> where one or both variants are not the type we want. We will use <code>map</code> to transform one <code>Result</code> type into another.</p>
<h3 id="basics"><a class="zola-anchor" href="#basics" aria-label="Anchor link for: basics">Basics</a></h3>
<p>If you primarily use a dynamically typed language, you may have used <code>map</code> as a replacement for iterating/looping over a list of values. We can do this same thing in Rust too.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> res: Vec<</span><span style="color:#b48ead;">usize</span><span>> = vec![</span><span style="color:#d08770;">5</span><span>];
</span><span style="color:#b48ead;">let</span><span> value: Vec<</span><span style="color:#b48ead;">usize</span><span>> = res.</span><span style="color:#96b5b4;">iter</span><span>().</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">n</span><span>| n * </span><span style="color:#d08770;">2</span><span>).</span><span style="color:#96b5b4;">collect</span><span>();
</span><span>assert_eq!(vec![</span><span style="color:#d08770;">10</span><span>], value);
</span></code></pre>
<p>Using <code>map</code> with a <code>Result</code> type is a little different. The <code>map</code> function calls a closure if, and only if, the variant of the <code>Result</code> enum is <code>Ok(T)</code>. Here is our very first <code>and_then</code> example, but using <code>map</code> instead.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> res: Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>> = Ok(</span><span style="color:#d08770;">5</span><span>);
</span><span style="color:#b48ead;">let</span><span> value: Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>> = res.</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">n</span><span>| n * </span><span style="color:#d08770;">2</span><span>);
</span><span>assert_eq!(Ok(</span><span style="color:#d08770;">10</span><span>), value);
</span></code></pre>
<p>This looks very similar to the first <code>and_then</code> example, but notice that we returned <code>Ok(n * 2)</code> in the <code>and_then</code> example and we are returning <code>n * 2</code> in this example. The <code>map</code> function <em>always</em> wraps the return value of the closure in the <code>Ok</code> variant.</p>
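<p>One way to convince ourselves of this relationship is to write the same computation both ways and compare the results. The two function names below are made up for this sketch.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">// Hypothetical helper: the closure returns a plain value and `map` wraps it in `Ok`.
fn double_with_map(res: Result<usize, &'static str>) -> Result<usize, &'static str> {
    res.map(|n| n * 2)
}

// Hypothetical helper: the closure has to do the wrapping in `Ok` itself.
fn double_with_and_then(res: Result<usize, &'static str>) -> Result<usize, &'static str> {
    res.and_then(|n| Ok(n * 2))
}
</code></pre>
<p>Both functions return <code>Ok(10)</code> for <code>Ok(5)</code> and pass an <code>Err</code> through untouched. The practical difference is that a closure given to <code>map</code> has no way to signal a failure, while a closure given to <code>and_then</code> can return an <code>Err</code>.</p>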
<h3 id="mapping-the-ok-result-variant-to-another-type"><a class="zola-anchor" href="#mapping-the-ok-result-variant-to-another-type" aria-label="Anchor link for: mapping-the-ok-result-variant-to-another-type">Mapping the <code>Ok</code> Result Variant To Another Type</a></h3>
<p>Let us look at an example where the <code>Ok(T)</code> variant of the <code>Result</code> enum is of the wrong type. Example: We are given <code>Result<i32, _></code>, but we want <code>Result<usize, _></code>.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> given: Result<</span><span style="color:#b48ead;">i32</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>> = Ok(</span><span style="color:#d08770;">5</span><span style="color:#b48ead;">i32</span><span>);
</span><span style="color:#b48ead;">let</span><span> desired: Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>> = given.</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">i32</span><span>| n as </span><span style="color:#b48ead;">usize</span><span>);
</span><span>
</span><span>assert_eq!(Ok(</span><span style="color:#d08770;">5</span><span style="color:#b48ead;">usize</span><span>), desired);
</span><span>
</span><span style="color:#b48ead;">let</span><span> value = desired.</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">usize</span><span>| Ok(n * </span><span style="color:#d08770;">2</span><span>));
</span><span>
</span><span>assert_eq!(Ok(</span><span style="color:#d08770;">10</span><span>), value);
</span></code></pre>
<p>In this example, the value of <code>given</code> is <code>Ok(5i32)</code>. Per our definition of <code>map</code>, <code>map</code> will match on the <code>Ok</code> variant and call the closure with the <code>i32</code> value of 5 as the argument. When the closure returns a value, <code>map</code> will wrap that value in <code>Ok</code> and return it.</p>
<p>If the given value is an <code>Err</code> variant, it is passed through both the <code>map</code> and <code>and_then</code> functions without the closure being called.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> given: Result<</span><span style="color:#b48ead;">i32</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>> = Err("</span><span style="color:#a3be8c;">an error</span><span>");
</span><span style="color:#b48ead;">let</span><span> desired: Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>> = given.</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">i32</span><span>| n as </span><span style="color:#b48ead;">usize</span><span>); </span><span style="color:#65737e;">// <--- closure not called
</span><span>
</span><span>assert_eq!(Err("</span><span style="color:#a3be8c;">an error</span><span>"), desired);
</span><span>
</span><span style="color:#b48ead;">let</span><span> value = desired.</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">usize</span><span>| Ok(n * </span><span style="color:#d08770;">2</span><span>)); </span><span style="color:#65737e;">// <--- closure not called
</span><span>
</span><span>assert_eq!(Err("</span><span style="color:#a3be8c;">an error</span><span>"), value);
</span></code></pre>
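<p>One caveat with these examples: an <code>as</code> cast never fails, so a negative <code>i32</code> silently becomes a very large <code>usize</code>. If that matters, the conversion is a good fit for <code>and_then</code>, because the closure can return an <code>Err</code>. The sketch below uses <code>usize::try_from</code> from the standard library, which is only available on toolchains much newer than the one this post was written against; the helper name <code>to_usize</code> and the error message are assumptions.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">use std::convert::TryFrom;

// Hypothetical helper: a checked version of the `n as usize` conversion.
fn to_usize(given: Result<i32, &'static str>) -> Result<usize, &'static str> {
    given.and_then(|n: i32| {
        // `try_from` fails for negative numbers; `map_err` turns that
        // failure into the `&'static str` error type used by the chain.
        usize::try_from(n).map_err(|_| "value does not fit in usize")
    })
}
</code></pre>
<p>Here <code>to_usize(Ok(5))</code> is <code>Ok(5)</code>, while <code>to_usize(Ok(-1))</code> is an <code>Err</code> instead of a huge number.</p>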
<h3 id="mapping-both-variants-of-result"><a class="zola-anchor" href="#mapping-both-variants-of-result" aria-label="Anchor link for: mapping-both-variants-of-result">Mapping Both Variants of Result</a></h3>
<p>What if both variants of the Result were different? Example: We are given <code>Result<i32, MyError></code>, but we want <code>Result<usize, &'static str></code>.</p>
<p>We only transform the <code>Ok(i32)</code> variant in the above example. In this example, we will need to also transform the <code>Err(MyError)</code> variant into <code>Err(&'static str)</code>. In order to do this, we will need to use <code>map_err</code> to handle the <code>Err(E)</code> variant. The <code>map_err</code> combinator function is the opposite of <code>map</code> because it matches only on <code>Err(E)</code> variants of <code>Result</code>.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">enum </span><span>MyError { Bad }
</span><span>
</span><span style="color:#b48ead;">let</span><span> given: Result<</span><span style="color:#b48ead;">i32</span><span>, MyError> = Err(MyError::Bad);
</span><span>
</span><span style="color:#b48ead;">let</span><span> desired: Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>> = given
</span><span> .</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">i32</span><span>| {
</span><span> n as </span><span style="color:#b48ead;">usize
</span><span> })
</span><span> .</span><span style="color:#96b5b4;">map_err</span><span>(|</span><span style="color:#bf616a;">_e</span><span>: MyError| {
</span><span> "</span><span style="color:#a3be8c;">bad MyError</span><span>"
</span><span> });
</span><span>
</span><span style="color:#b48ead;">let</span><span> value = desired.</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">usize</span><span>| Ok(n * </span><span style="color:#d08770;">2</span><span>));
</span><span>
</span><span>assert_eq!(Err("</span><span style="color:#a3be8c;">bad MyError</span><span>"), value);
</span></code></pre>
<p>You must understand that:</p>
<ul>
<li><code>map</code> only handles the <code>Ok(T)</code> variant of <code>Result</code></li>
<li><code>map_err</code> only handles the <code>Err(E)</code> variant of <code>Result</code></li>
</ul>
<h2 id="different-return-types-using-and-then-and-map"><a class="zola-anchor" href="#different-return-types-using-and-then-and-map" aria-label="Anchor link for: different-return-types-using-and-then-and-map">Different Return Types Using <code>and_then</code> And <code>map</code></a></h2>
<p>The <code>and_then</code>, <code>map</code> and <code>map_err</code> functions are not constrained to return the same type inside their variants. The <code>map</code> function can be given <code>Ok(T)</code> and return <code>Ok(U)</code>. The <code>map_err</code> function can be given <code>Err(E)</code> and return <code>Err(F)</code>. The <code>and_then</code> function can be given <code>Ok(T)</code> and return <code>Ok(U)</code> or <code>Err(E)</code>! Note that <code>and_then</code> can change the <code>Ok</code> type but must keep the same error type <code>E</code>.</p>
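<p>Here is a small sketch of each of those type changes in isolation. The <code>ParseError</code> type and the three function names are hypothetical and exist only for this illustration.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">// Hypothetical error type for illustration.
#[derive(Debug, PartialEq)]
enum ParseError {
    NotANumber,
}

// `map`: Ok(&str) becomes Ok(usize); the error type is untouched.
fn length(res: Result<&'static str, ParseError>) -> Result<usize, ParseError> {
    res.map(|s| s.len())
}

// `map_err`: Err(ParseError) becomes Err(String); the Ok type is untouched.
fn describe(res: Result<usize, ParseError>) -> Result<usize, String> {
    res.map_err(|e| format!("failed: {:?}", e))
}

// `and_then`: Ok(&str) becomes Ok(i32) or Err(ParseError).
// The closure must return the same error type as the incoming `Result`.
fn parse(res: Result<&'static str, ParseError>) -> Result<i32, ParseError> {
    res.and_then(|s| s.parse::<i32>().map_err(|_| ParseError::NotANumber))
}
</code></pre>
<p>Notice that the closure given to <code>and_then</code> needs its own <code>map_err</code>, because the error returned by <code>str::parse</code> is not a <code>ParseError</code>.</p>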
<p>Let us try a complicated example where we are given a nested Result, but none of the types match the desired types we want. Example: We are given <code>Result<Result<i32, FooError>, BarError></code>, but we want <code>Result<usize, &'static str></code>.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">enum </span><span>FooError {
</span><span> Bad,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">enum </span><span>BarError {
</span><span> Horrible,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">let</span><span> res: Result<Result<</span><span style="color:#b48ead;">i32</span><span>, FooError>, BarError> = Ok(Err(FooError::Bad));
</span><span>
</span><span style="color:#b48ead;">let</span><span> value = res
</span><span>
</span><span> </span><span style="color:#65737e;">// `map` will only call the closure for `Ok(Result<i32, FooError>)`
</span><span> .</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">res</span><span>: Result<</span><span style="color:#b48ead;">i32</span><span>, FooError>| {
</span><span>
</span><span> </span><span style="color:#65737e;">// transform `Ok(Result<i32, FooError>)` into `Ok(Result<usize, &'static str>)`
</span><span> res
</span><span> </span><span style="color:#65737e;">// transform i32 to usize
</span><span> .</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">i32</span><span>| n as </span><span style="color:#b48ead;">usize</span><span>)
</span><span>
</span><span> </span><span style="color:#65737e;">// transform `FooError` into `&'static str`
</span><span> .</span><span style="color:#96b5b4;">map_err</span><span>(|</span><span style="color:#bf616a;">_e</span><span>: FooError| "</span><span style="color:#a3be8c;">bad FooError</span><span>")
</span><span>
</span><span> })
</span><span>
</span><span> </span><span style="color:#65737e;">// `map_err` will only call the closure for `Err(BarError)`
</span><span> .</span><span style="color:#96b5b4;">map_err</span><span>(|</span><span style="color:#bf616a;">_e</span><span>: BarError| {
</span><span> </span><span style="color:#65737e;">// transform `BarError` into `&'static str`
</span><span> "</span><span style="color:#a3be8c;">horrible BarError</span><span>"
</span><span> })
</span><span>
</span><span> </span><span style="color:#65737e;">// `and_then` will only call the closure for `Ok(Result<usize, &'static str>)`
</span><span> </span><span style="color:#65737e;">// Note: this is result of our first `map` above
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">n</span><span>: Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>>| {
</span><span> </span><span style="color:#65737e;">// transform (flatten) `Ok(Result<usize, &'static str>)` into `Result<usize, &'static str>`
</span><span> </span><span style="color:#65737e;">// this may be `Ok(Ok(usize))` _or_ `Ok(Err(&'static str))`
</span><span> n
</span><span> })
</span><span>
</span><span> </span><span style="color:#65737e;">// `and_then` will only call the closure for `Ok(usize)`
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">n</span><span>: </span><span style="color:#b48ead;">usize</span><span>| {
</span><span> </span><span style="color:#65737e;">// transform Ok(usize) into Ok(usize * 2)
</span><span> Ok(n * </span><span style="color:#d08770;">2</span><span>)
</span><span> });
</span><span>
</span><span>assert_eq!(Err("</span><span style="color:#a3be8c;">bad FooError</span><span>"), value);
</span></code></pre>
<p>I decided to inline the explanation into the comments in an effort to make things as clear as possible. You can see how quickly things get complicated. It is my general strategy to try and flatten the nested <code>Result</code> out as early as possible to simplify later combinators.</p>
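<p>As a rough illustration of flattening early, the same conversion can be sketched with the outer error converted first and the nested <code>Result</code> flattened in the very next step. The function name <code>flatten_then_double</code> is an assumption; the error types mirror the ones in the example above.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">enum FooError {
    Bad,
}

enum BarError {
    Horrible,
}

// Hypothetical helper: flatten as early as possible, then work on a flat `Result`.
fn flatten_then_double(res: Result<Result<i32, FooError>, BarError>) -> Result<usize, &'static str> {
    res
        // Convert the outer error first so that both layers can share an error type.
        .map_err(|_e: BarError| "horrible BarError")
        // Flatten: convert the inner error and return the inner `Result` directly.
        .and_then(|inner: Result<i32, FooError>| inner.map_err(|_e: FooError| "bad FooError"))
        // From here on there is a single, flat `Result<i32, &'static str>`.
        .map(|n: i32| (n as usize) * 2)
}
</code></pre>
<p>Called with <code>Ok(Err(FooError::Bad))</code>, this returns <code>Err("bad FooError")</code>, the same result as the longer chain. Whether this ordering reads better is a matter of taste.</p>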
<h2 id="conclusion"><a class="zola-anchor" href="#conclusion" aria-label="Anchor link for: conclusion">Conclusion</a></h2>
<p>A lot of functions return <code>Result</code> to represent the happy-path value and the error case. Using combinators can help isolate error handling from normal computation. Combinators also allow us to pass along errors all the way to the end. I like <a href="https://fsharpforfunandprofit.com/rop/">Railway Oriented Programming</a> as a good visualization of this concept. The <code>and_then</code>, <code>map</code> and <code>or_else</code> combinators exist on the <code>Option</code> type too. You should now be better equipped to read other code that uses <code>Result</code> combinator functions and to write them yourself.</p>
<h2 id="extras"><a class="zola-anchor" href="#extras" aria-label="Anchor link for: extras">Extras</a></h2>
<h3 id="or-else-combinator"><a class="zola-anchor" href="#or-else-combinator" aria-label="Anchor link for: or-else-combinator"><code>or_else</code> Combinator</a></h3>
<p>The <code>or_else</code> function combinator is the opposite of <code>and_then</code>. It only calls the closure if the result is <code>Err(E)</code>. I do not find myself using <code>or_else</code> as often as <code>and_then</code>. Please feel free to show me what I am missing.</p>
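<p>One place where <code>or_else</code> might earn its keep is recovering from a specific error with a fallback value. The following is only a sketch; the helper name <code>recover</code> and the error strings are invented for the example.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">// Hypothetical helper: treat a "not found" error as a count of zero and
// pass every other error along unchanged.
fn recover(res: Result<usize, &'static str>) -> Result<usize, &'static str> {
    // The closure is only called for the `Err` variant.
    res.or_else(|e: &'static str| {
        if e == "not found" {
            Ok(0)
        } else {
            Err(e)
        }
    })
}
</code></pre>
<p>An <code>Ok</code> value passes through without the closure being called, <code>Err("not found")</code> becomes <code>Ok(0)</code>, and any other error stays an error.</p>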
<h3 id="debugging-complex-combinators"><a class="zola-anchor" href="#debugging-complex-combinators" aria-label="Anchor link for: debugging-complex-combinators">Debugging Complex Combinators</a></h3>
<p>I like to make types explicit when trying to get a complex combination working. However, this can get unrealistic when dealing with iterators or futures that become deeply nested. When that happens, I deliberately assign a value to a variable of the wrong type so that the compiler error tells me the actual type. Here is a small example, assuming I am confused as to what type <code>res</code> is:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#65737e;">// assume it is not clear what type `res` is
</span><span style="color:#b48ead;">let</span><span> res: Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>> = Ok(</span><span style="color:#d08770;">5</span><span>);
</span><span style="color:#b48ead;">let</span><span> c: </span><span style="color:#b48ead;">u8 </span><span>= res;
</span></code></pre>
<p>Which generates:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>error[E0308]: mismatched types
</span><span> --> <anon>:5:13
</span><span> |
</span><span>5 | let c: u8 = res;
</span><span> | ^^^ expected u8, found enum `std::result::Result`
</span><span> |
</span><span> = note: expected type `u8`
</span><span> = note: found type `std::result::Result<usize, &'static str>`
</span></code></pre>
<p>I normally use the variable <code>c</code> because I want to <em>see</em> the type of <code>res</code> in the compiler error message. Haha, I know.</p>
<p>Here is an example using it in a combinator:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> res: Result<</span><span style="color:#b48ead;">usize</span><span>, &</span><span style="color:#b48ead;">'static str</span><span>> = Ok(</span><span style="color:#d08770;">5</span><span>);
</span><span style="color:#b48ead;">let</span><span> value = res.</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">wut</span><span>| {
</span><span> </span><span style="color:#b48ead;">let</span><span> c: </span><span style="color:#b48ead;">u8 </span><span>= wut;
</span><span>});
</span></code></pre>
<p>Which generates:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>error[E0308]: mismatched types
</span><span> --> <anon>:6:17
</span><span> |
</span><span>6 | let c: u8 = wut;
</span><span> | ^^^ expected u8, found usize
</span><span>
</span><span>error[E0308]: mismatched types
</span><span> --> <anon>:5:32
</span><span> |
</span><span>5 | let value = res.and_then(|wut| {
</span><span> | ^ expected enum `std::result::Result`, found ()
</span><span> |
</span><span> = note: expected type `std::result::Result<_, &str>`
</span><span> = note: found type `()`
</span><span>
</span><span>error: aborting due to 2 previous errors
</span></code></pre>
<p>The compiler errors show both the expected input and expected output. I find this really useful when I get lost in all the combinators.</p>
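<p>On toolchains much newer than the one used in this post, a runtime alternative is <code>std::any::type_name</code>, which returns a description of a type as a string. The sketch below wraps it in a helper named <code>type_of</code>, a name chosen for this example. The exact text of the returned string is not guaranteed by the standard library, so it is best treated as a debugging aid only.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">use std::any::type_name;

// Hypothetical helper: returns the compiler's description of the type of
// the value it is given. Only the type is used; the value is ignored.
fn type_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}
</code></pre>
<p>Printing <code>type_of(&res)</code> inside a chain shows what is flowing through at that point without having to break the build first.</p>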
<h4 id="nightly-error-format"><a class="zola-anchor" href="#nightly-error-format" aria-label="Anchor link for: nightly-error-format">Nightly Error Format</a></h4>
<p>As of this writing, Rust <code>1.11.0</code> is the current stable version. Rust <code>1.11.0</code> does not have the new error format that is present in Rust nightly. If I am struggling with a compiler error, I often switch over to using Rust nightly until I solve the error. <a href="https://rustup.rs/">Rustup</a> makes this easy.</p>
<p>In your current working directory:</p>
<ul>
<li>Switch to nightly - <code>rustup override set nightly</code></li>
<li>Switch to stable - <code>rustup override set stable</code></li>
</ul>
<h1>Introduction to nom: a parsing framework written in Rust</h1>
<p>Herman J. Radtke III, 2016-08-08, <a href="https://hermanradtke.com/2016/08/08/introduction-to-nom-rust-parsing-combinator-framework.html/">https://hermanradtke.com/2016/08/08/introduction-to-nom-rust-parsing-combinator-framework.html/</a></p>
<p>This is an introduction to a parsing library called <a href="https://github.com/Geal/nom">nom</a>. The <code>nom</code> crate is written by Geoffrey Couprie, aka <a href="https://github.com/Geal">Geal</a>, and is a remarkably complete and powerful library for building parsers. I recently did a lot of parsing of bytes on the wire for my <a href="https://github.com/hjr3/carp-rs">carp</a> library and it was a lot of work. I wish I had come across the <code>nom</code> library before I had done all of that.</p>
<p>The description of <code>nom</code> is a <em>Rust parser combinator framework</em> which can sound a little intimidating. Another way of saying this is that <code>nom</code> uses a lot of small functions and macros that make parsing code easy to write and read. I will say that <code>nom</code> can be a bit intimidating to start using. The API has a lot of surface area to learn and the error messages can be hard to understand. The cryptic error messages are due to the use of macros and not anything specific to nom. It can take a little bit of effort to get started using <code>nom</code>, but I think it is well worth it.</p>
<h2 id="parsing-text"><a class="zola-anchor" href="#parsing-text" aria-label="Anchor link for: parsing-text">Parsing Text</a></h2>
<p>The <code>nom</code> library can parse pretty much anything, but let us start with text. When parsing text, we might be tempted to reach for something like regular expressions. Using <code>nom</code> is an alternative approach that leverages Rust's type system. Also, <code>nom</code> is probably faster and more efficient than any regular expression we might write. The first thing to understand about <code>nom</code> is that it only deals in byte arrays (<code>&[u8]</code>). Our text to parse will most likely be in the form of a string. We can convert a string to a byte array using <code>.as_bytes()</code>. To get usable results from our parser, we must convert (or map) a matched sequence of bytes into the type that we want. Knowing this, let us start looking at how to parse text input.</p>
<p>The bread and butter of our parsing is going to be the use of the <code>tag!</code> and <code>map_res!</code> macros. The <code>tag!</code> macro consumes the specified string from the byte array. For example, if we had a string of <code>"hello Herman"</code>, we would specify <code>tag!("hello")</code> to parse out the first word. The <code>tag!</code> macro works great when we know what string we want to match. It does not work for dynamic strings. The <code>map_res!</code> macro will be used for dynamic input.</p>
<p>We have to write a more abstract parser for dynamic strings. We can parse the <code>"Herman"</code> part of the string using the <code>alpha</code> function. The <code>alpha</code> function is provided by <code>nom</code> and will return the longest list of alphabetic characters it finds as a byte array. If it is dynamic, it probably means this is part of the input we want to capture. Getting back a byte array of <code>&['H', 'e', 'r', 'm', 'a', 'n']</code> is not ideal to work with. We want to convert that into a string. Using <code>map_res!</code> we can map (convert) the byte array into a string: <code>map_res!(alpha, std::str::from_utf8)</code>. The <code>map_res!</code> (map result) macro is known as a combinator. We are <em>combining</em> the <code>alpha</code> function with the <a href="https://doc.rust-lang.org/std/str/fn.from_utf8.html">std::str::from_utf8</a> function. The <code>std::str::from_utf8</code> function is part of the Rust standard library and converts a slice of bytes (or a byte array) into a UTF-8 encoded string. So <code>map_res!(alpha, std::str::from_utf8)</code> is saying that we want to grab the longest array of alphabetic characters and then we want to pass that byte array of alphabetic characters to the <code>std::str::from_utf8</code> function.</p>
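<p>To make the idea of combining those two steps concrete without any macros, here is a plain Rust sketch of roughly what happens. This is not how <code>nom</code> implements <code>alpha</code> or <code>map_res!</code>; the function names <code>leading_alpha</code> and <code>leading_alpha_str</code> are made up, and unlike a real parser this sketch does not report the remaining input.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">use std::str::{from_utf8, Utf8Error};

// Hypothetical stand-in for the matching step: return the longest leading
// run of ASCII letters as a byte slice.
fn leading_alpha(input: &[u8]) -> &[u8] {
    let end = input
        .iter()
        .position(|b| !b.is_ascii_alphabetic())
        .unwrap_or(input.len());
    &input[..end]
}

// The "map result" step: pass the matched bytes to `from_utf8`, which
// returns a `Result` because the conversion is allowed to fail.
fn leading_alpha_str(input: &[u8]) -> Result<&str, Utf8Error> {
    from_utf8(leading_alpha(input))
}
</code></pre>
<p>Calling <code>leading_alpha_str(b"Herman rocks")</code> stops at the space and yields the string <code>"Herman"</code>.</p>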
<p>Now that we can parse both parts of <code>hello Herman</code>, we can put it all together into a more complex parser:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">macro_use</span><span>]
</span><span style="color:#b48ead;">extern crate</span><span> nom;
</span><span>
</span><span style="color:#b48ead;">use </span><span>nom::{IResult, space, alpha, alphanumeric, digit};
</span><span>
</span><span>named!(name_parser<&</span><span style="color:#b48ead;">str</span><span>>,
</span><span> chain!(
</span><span> tag!("</span><span style="color:#a3be8c;">hello</span><span>") ~
</span><span> space? ~
</span><span> name: map_res!(
</span><span> alpha,
</span><span> std::str::from_utf8
</span><span> ) ,
</span><span>
</span><span> || name
</span><span> )
</span><span>);
</span></code></pre>
<p>In the above example, we are using the <code>named!</code> macro to create a parser function named <code>name_parser</code>. We specify the <code>&str</code> type as the return type of our parser. We use the <code>chain!</code> combinator macro to apply a series of parsers and assemble their results. We use the <code>~</code> character as the separator between parser functions/macros and a <code>,</code> to denote the end of the parser chain. The last part of the <code>chain!</code> combinator takes a closure, <code>|| name</code>, where we can use the previously defined <code>name</code> variable. We now have a function <code>name_parser</code> that accepts a string that begins with <code>hello</code>, is optionally followed by whitespace (the <code>?</code> after <code>space</code> makes it optional) and then contains a series of alpha characters. We map those alpha characters into a string and assign that value to a variable called <code>name</code>. Finally, we return <code>name</code> from the closure, which will be the <code>&str</code> our function returns. Here is a test case proving it:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">test</span><span>]
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">test_name_parser</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> empty = &</span><span style="color:#b48ead;">b</span><span>""[..];
</span><span> assert_eq!(</span><span style="color:#96b5b4;">name_parser</span><span>("</span><span style="color:#a3be8c;">hello Herman</span><span>".</span><span style="color:#96b5b4;">as_bytes</span><span>()), IResult::Done(empty, ("</span><span style="color:#a3be8c;">Herman</span><span>")));
</span><span> assert_eq!(</span><span style="color:#96b5b4;">name_parser</span><span>("</span><span style="color:#a3be8c;">hello Kimberly</span><span>".</span><span style="color:#96b5b4;">as_bytes</span><span>()), IResult::Done(empty, ("</span><span style="color:#a3be8c;">Kimberly</span><span>")));
</span><span>}
</span></code></pre>
<p>Notice that the <code>name_parser</code> function does not actually return a <code>&str</code>. It actually returns an <code>IResult</code> type that represents whether the parsing is <code>Done</code>, <code>Incomplete</code> or an <code>Error</code>. If the parsing was successful, the result will be <code>IResult::Done(input_remaining, output)</code>. In our above test, there is no more input left so the byte array is empty. The <em>output</em> is our <code>&str</code> containing the dynamic name.</p>
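<p>To make the three outcomes concrete, here is a hand-written sketch of the same "hello name" parser using only the standard library. The <code>ParseResult</code> enum and the <code>hello_name</code> function are hypothetical stand-ins that I made up for illustration; they only mimic the shape of <code>IResult</code> and are not how <code>nom</code> is implemented.</p>

```rust
// Simplified, hypothetical stand-in for nom's `IResult`.
#[derive(Debug, PartialEq)]
enum ParseResult<'a, O> {
    Done(&'a [u8], O), // (input_remaining, output)
    Incomplete,        // ran out of input before a decision could be made
    Error,             // the input does not match
}

// "hello", one or more spaces, then one or more ASCII letters.
fn hello_name<'a>(input: &'a [u8]) -> ParseResult<'a, &'a str> {
    let tag: &[u8] = b"hello";
    if !input.starts_with(tag) {
        // A strict prefix of "hello" could still match once more bytes arrive.
        return if tag.starts_with(input) {
            ParseResult::Incomplete
        } else {
            ParseResult::Error
        };
    }

    let rest = &input[tag.len()..];
    let spaces = rest.iter().take_while(|b| **b == b' ').count();
    let rest = &rest[spaces..];
    let letters = rest.iter().take_while(|b| b.is_ascii_alphabetic()).count();
    if spaces == 0 || letters == 0 {
        // Running out of input is "incomplete"; a wrong byte is an error.
        return if rest.is_empty() {
            ParseResult::Incomplete
        } else {
            ParseResult::Error
        };
    }

    match std::str::from_utf8(&rest[..letters]) {
        Ok(name) => ParseResult::Done(&rest[letters..], name),
        Err(_) => ParseResult::Error,
    }
}

#[test]
fn test_hello_name() {
    let empty = &b""[..];
    assert_eq!(hello_name(b"hello Herman"), ParseResult::Done(empty, "Herman"));
    assert_eq!(hello_name(b"hel"), ParseResult::Incomplete);
    assert_eq!(hello_name(b"goodbye"), ParseResult::Error);
}
```

<p>Writing this by hand shows how much bookkeeping the <code>chain!</code> version hides: tracking the remaining input and deciding between "incomplete" and "error" at every step.</p>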
<p>This might seem like a lot of work for such a basic parser. However, this is building a foundation for creating much more complex parsers. For example, we can now create a parser to convert a string of numeric characters into a number:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#65737e;">// Parse a numerical array into a string and then from a string into a number
</span><span>named!(usize_digit<</span><span style="color:#b48ead;">usize</span><span>>,
</span><span> map_res!(
</span><span> map_res!(
</span><span> digit,
</span><span> std::str::from_utf8
</span><span> ),
</span><span> std::str::FromStr::from_str
</span><span> )
</span><span>);
</span></code></pre>
<p>And we can even go a step further and separate the parsing of a numerical array into a smaller parser called <code>numeric_string</code>. We can then map the result of <code>numeric_string</code> into a <code>usize</code> type in the <code>usize_digit</code> parser function:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>named!(numeric_string<&</span><span style="color:#b48ead;">str</span><span>>,
</span><span> map_res!(
</span><span> digit,
</span><span> std::str::from_utf8
</span><span> )
</span><span>);
</span><span>
</span><span>named!(usize_digit<</span><span style="color:#b48ead;">usize</span><span>>,
</span><span> map_res!(
</span><span> numeric_string,
</span><span> std::str::FromStr::from_str
</span><span> )
</span><span>);
</span></code></pre>
<p>Now that we have a generic parser to parse numerical arrays, we can also create a parser to parse a string into a <code>u64</code> using the same <code>numeric_string</code> function defined above:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>named!(u64_digit<</span><span style="color:#b48ead;">u64</span><span>>,
</span><span> map_res!(
</span><span> numeric_string,
</span><span> std::str::FromStr::from_str
</span><span> )
</span><span>);
</span></code></pre>
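<p>The same two-step idea can be sketched without <code>nom</code>, assuming we only care about input that consists entirely of digits (the <code>nom</code> parsers above consume a leading run of digits and return the rest). The <code>parse_number</code> helper is a hypothetical name; it relies on the standard <code>FromStr</code> trait, which is what lets one text-level parser feed both <code>usize</code> and <code>u64</code>.</p>

```rust
use std::str::FromStr;

// Step 1: bytes -> text, accepting only a non-empty run of ASCII digits.
fn numeric_string(input: &[u8]) -> Option<&str> {
    let text = std::str::from_utf8(input).ok()?;
    if !text.is_empty() && text.bytes().all(|b| b.is_ascii_digit()) {
        Some(text)
    } else {
        None
    }
}

// Step 2: text -> any type that implements `FromStr`.
// The caller picks the target type, so this one function covers
// usize, u64 and friends.
fn parse_number<T: FromStr>(input: &[u8]) -> Option<T> {
    numeric_string(input)?.parse::<T>().ok()
}
```

<p>Calling <code>parse_number::&lt;usize&gt;(b"42")</code> and <code>parse_number::&lt;u64&gt;(b"42")</code> reuses the same first step, and a value that does not fit the target type (for example <code>b"256"</code> as a <code>u8</code>) comes back as <code>None</code> rather than silently wrapping.</p>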
<p>You start to see how powerful the combination of small parsers can be. Not only does <code>nom</code> make it easy to write parsers, I think it also makes it easy to read parsers later and understand what they are doing. It also makes it easier to write tests against smaller parsers to verify their correctness. We have just scratched the surface of what <code>nom</code> can do. There are many other <a href="http://rust.unhandledexpression.com/nom/#functions">parsers</a> and <a href="http://rust.unhandledexpression.com/nom/#macros">combinators</a> available. There are also a number of <a href="https://github.com/Geal/nom/issues/14">example</a> <a href="https://github.com/Geal/nom/tree/master/tests">parsers</a> to use as a reference.</p>
http://activitystrea.ms/schema/1.0/postConnecting a webservice to a database in Rust2016-05-23T00:00:00+00:002016-05-23T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2016/05/23/connecting-webservice-database-rust.html/<p><strong>Note: This blog post does not work with rustc 1.19.0 or later due to a <a href="https://github.com/rust-lang/rust/issues/42460">regression</a> in rust 1.19.0. Use the following to set rust 1.18.0 up:</strong></p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cd /path/to/project
</span><span>$ rustup install 1.18.0
</span><span>$ rustup override set 1.18.0
</span></code></pre>
<p>In this post we are going to hook our <a href="/2016/05/16/creating-a-basic-webservice-in-rust.html">basic webservice</a> up to a database. The webservice will accept a request for <code>/orders</code>, query the database for orders and return a json response. I will be using PostgreSQL in this example. There is a pure Rust <a href="https://crates.io/crates/postgres">PostgreSQL driver</a> written by Steven Fackler (sfackler) that I think is well done. That being said, the <a href="https://crates.io/crates/mysql">mysql crate</a> looks well done too.</p>
<p>This post goes into a fair amount of detail. You can skip right to the <a href="https://hermanradtke.com/2016/05/23/connecting-webservice-database-rust.html/#tldr">TL;DR</a> for the final solution.</p>
<h2 id="preparation"><a class="zola-anchor" href="#preparation" aria-label="Anchor link for: preparation">Preparation</a></h2>
<p>We have to get Postgres set up before we start writing Rust code. I am using <a href="https://github.com/jackdb/pg-app-dev-vm">https://github.com/jackdb/pg-app-dev-vm</a> in combination with <a href="https://www.vagrantup.com/">Vagrant</a> to automatically provision a working Postgres instance. Simply clone the git repository and then run <code>vagrant up</code>. I am using the default values of <code>myapp</code> for username, <code>dbpass</code> for the password and <code>myapp</code> for the database name. I also have a script called <a href="https://github.com/hjr3/webservice-demo-rs/blob/blog-post-2/db-migrate.sh">db-migrate.sh</a> that will create the orders schema necessary to get this example working.</p>
<h2 id="crate-dependencies"><a class="zola-anchor" href="#crate-dependencies" aria-label="Anchor link for: crate-dependencies">Crate Dependencies</a></h2>
<p>At this point we have a working database instance with an orders table containing two rows. The first thing we need to do is update our <code>Cargo.toml</code> file with our postgres dependency. We also need to add the <a href="https://crates.io/crates/rustc-serialize">rustc-serialize crate</a> so we can serialize a native Rust data structure into json format. Next time we run <code>cargo build</code> both crates will automatically be downloaded and made available to our webservice.</p>
<pre data-lang="toml" style="background-color:#2b303b;color:#c0c5ce;" class="language-toml "><code class="language-toml" data-lang="toml"><span>[package]
</span><span style="color:#bf616a;">name </span><span>= "</span><span style="color:#a3be8c;">orders</span><span>"
</span><span style="color:#bf616a;">version </span><span>= "</span><span style="color:#a3be8c;">0.1.0</span><span>"
</span><span style="color:#bf616a;">authors </span><span>= ["</span><span style="color:#a3be8c;">Your Name <your.name@example.com></span><span>"]
</span><span>
</span><span>[dependencies]
</span><span style="color:#bf616a;">nickel </span><span>= "</span><span style="color:#a3be8c;">0.8.1</span><span>"
</span><span style="color:#bf616a;">postgres </span><span>= "</span><span style="color:#a3be8c;">0.11.7</span><span>"
</span><span style="color:#bf616a;">rustc-serialize </span><span>= "</span><span style="color:#a3be8c;">0.3.19</span><span>"
</span></code></pre>
<p><em>Note: The rustc-serialize crate works, but it is not being actively developed. The future of json serialization is the <a href="https://crates.io/crates/serde_json">serde_json crate</a>. Unfortunately, serde's ability to automatically serialize data structures is only available on Rust nightly (the version of Rust in active development). Due to this restriction, I have chosen to use rustc-serialize instead.</em></p>
<p>We now need to open up <code>src/main.rs</code> and start adding our dependencies. We need to import the <code>postgres</code> and <code>rustc_serialize</code> crates. These two crates are not exporting macros, so we can leave off the <code>#[macro_use]</code> attribute. Also, notice that the crate name <code>rustc-serialize</code> (hyphen) is imported as <code>rustc_serialize</code> (underscore). The rustc-serialize crate is from early Rust days and the rules around crate names have changed.</p>
<p>Now we will alias which parts of the crates we want to use. We will be using the postgres <code>Connection</code> struct and the <code>SslMode</code> enum. We also will be using the rustc_serialize <code>json</code> module.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">macro_use</span><span>] </span><span style="color:#b48ead;">extern crate</span><span> nickel;
</span><span style="color:#b48ead;">extern crate</span><span> postgres;
</span><span style="color:#b48ead;">extern crate</span><span> rustc_serialize;
</span><span>
</span><span style="color:#b48ead;">use </span><span>nickel::{Nickel, MediaType};
</span><span style="color:#b48ead;">use </span><span>postgres::{Connection, SslMode};
</span><span style="color:#b48ead;">use </span><span>rustc_serialize::json;
</span></code></pre>
<h2 id="order-struct"><a class="zola-anchor" href="#order-struct" aria-label="Anchor link for: order-struct">Order Struct</a></h2>
<p>We will be querying the database for orders and then mapping the resulting rows into one or more objects. Our database schema contains an orders table with an order id, an order total, the type of currency that was used and the status of the order. We need to create an <code>Order</code> struct to map each row to. The postgres crate provides <a href="https://github.com/sfackler/rust-postgres#type-correspondence">type correspondence</a> documentation that maps each Postgres type to a Rust type. Using that information, we can create the <code>Order</code> struct with the correct types.</p>
<p>Once the query result has been mapped into an <code>Order</code> struct, we want to serialize that into json. We could <a href="https://doc.rust-lang.org/rustc-serialize/rustc_serialize/json/index.html#verbose-example-of-tojson-usage">manually implement</a> the <code>ToJson</code> trait that tells rustc_serialize how to convert an <code>Order</code> struct into json, but I do not want to write code unless I have to. Instead, we can use the <code>#[derive()]</code> attribute and automatically generate the trait implementation for <code>RustcEncodable</code>. The <code>RustcEncodable</code> trait will allow us to call <code>json::encode()</code> on our <code>Order</code> struct.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">derive</span><span>(RustcEncodable)]
</span><span style="color:#b48ead;">struct </span><span>Order {
</span><span> </span><span style="color:#bf616a;">id</span><span>: </span><span style="color:#b48ead;">i32</span><span>,
</span><span> </span><span style="color:#bf616a;">total</span><span>: </span><span style="color:#b48ead;">f64</span><span>,
</span><span> </span><span style="color:#bf616a;">currency</span><span>: String,
</span><span> </span><span style="color:#bf616a;">status</span><span>: String,
</span><span>}
</span></code></pre>
<p>Using <code>#[derive()]</code> can feel a bit like magic. The <code>Order</code> struct is just a shell around some primitive Rust types. The rustc_serialize crate has <a href="https://doc.rust-lang.org/rustc-serialize/rustc_serialize/trait.Encodable.html">already implemented</a> <code>Encodable</code> for pretty much all the primitive types. As such, the compiler has enough information to automatically implement the <code>RustcEncodable</code> trait for the <code>Order</code> struct. If we had used a type that did not already implement <code>Encodable</code>, then the compiler would have thrown an error.</p>
<p><em>Note: If you are wondering why we derive <code>RustcEncodable</code> to automatically implement the <code>Encodable</code> trait, know that the rustc_serialize crate used to be part of the std library, was deprecated and migrated out to <a href="https://crates.io">crates.io</a>. In order for the rustc_serialize crate not to clash with the code still in the stdlib, the name we derive was modified. You can look to this <a href="https://github.com/rust-lang/rust/commit/a76a80276852f05f30adaa4d2a8a2729b5fc0bfa">commit</a> for more details. This is a unique case. In the vast majority of cases, the name of the trait and the name of the trait we are deriving are the same.</em></p>
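<p>If the derive still feels like magic, it may help to see roughly what it saves us from writing. The sketch below uses a hypothetical <code>ToJsonText</code> trait that I invented for illustration; it is not the rustc_serialize API and its escaping is deliberately minimal. The point is only the shape: every field type already knows how to encode itself, so the struct implementation just stitches the pieces together.</p>

```rust
// Hypothetical trait standing in for rustc_serialize's `Encodable`.
trait ToJsonText {
    fn to_json_text(&self) -> String;
}

impl ToJsonText for i32 {
    fn to_json_text(&self) -> String {
        self.to_string()
    }
}

impl ToJsonText for f64 {
    fn to_json_text(&self) -> String {
        // Debug formatting keeps the trailing ".0" on whole numbers.
        format!("{:?}", self)
    }
}

impl ToJsonText for String {
    fn to_json_text(&self) -> String {
        // Minimal escaping only; a real encoder also handles control characters.
        format!("\"{}\"", self.replace('\\', "\\\\").replace('"', "\\\""))
    }
}

// A vector is encodable as long as its items are.
impl<T: ToJsonText> ToJsonText for Vec<T> {
    fn to_json_text(&self) -> String {
        let items: Vec<String> = self.iter().map(|item| item.to_json_text()).collect();
        format!("[{}]", items.join(","))
    }
}

struct Order {
    id: i32,
    total: f64,
    currency: String,
    status: String,
}

// Roughly the kind of code a derive generates: delegate to each field.
impl ToJsonText for Order {
    fn to_json_text(&self) -> String {
        format!(
            r#"{{"id":{},"total":{},"currency":{},"status":{}}}"#,
            self.id.to_json_text(),
            self.total.to_json_text(),
            self.currency.to_json_text(),
            self.status.to_json_text()
        )
    }
}
```

<p>If <code>Order</code> contained a field whose type had no <code>ToJsonText</code> implementation, the last <code>impl</code> would fail to compile, which mirrors the error the derive produces for a field that is not <code>Encodable</code>.</p>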
<h2 id="database-connection"><a class="zola-anchor" href="#database-connection" aria-label="Anchor link for: database-connection">Database Connection</a></h2>
<p>We are now ready to set up our database connection. Based on the Postgres connection information provided during the <a href="https://hermanradtke.com/2016/05/23/connecting-webservice-database-rust.html/#preparation">Preparation</a> section, we can create a database url. We then create a <code>Connection</code> object that represents our connection to the Postgres database. Using the <code>SslMode</code> enum, we opt to create the connection over plain text.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> db_url = "</span><span style="color:#a3be8c;">postgresql://myapp:dbpass@localhost:15432/myapp</span><span>";
</span><span> </span><span style="color:#b48ead;">let</span><span> db = Connection::connect(db_url, SslMode::None)
</span><span> .</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Unable to connect to database</span><span>");
</span><span>
</span><span> </span><span style="color:#65737e;">// ...
</span><span>}
</span></code></pre>
<p>Connecting to the database can fail, so <code>Connection::connect()</code> returns a <a href="https://doc.rust-lang.org/std/result/enum.Result.html">Result</a> type. Most examples you will see choose to <code>.unwrap()</code> the <code>Result</code> type, which would yield the connection on <code>Ok</code> or panic on <code>Err</code>. I will be using <code>.expect()</code> instead of <code>.unwrap()</code>. Using <code>.expect()</code> is just like using <code>.unwrap()</code> except that it allows for a more user-friendly error message if something goes wrong. This will help us debug any issues we may encounter, especially if you are modifying these examples.</p>
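<p>Here is a small standard-library sketch of that behavior. The <code>parse_addr</code> function is a hypothetical stand-in for any fallible setup step; it parses a socket address instead of opening a database connection so that it runs without Postgres.</p>

```rust
use std::net::{AddrParseError, SocketAddr};

// Stand-in for a fallible setup step such as connecting to a database.
fn parse_addr(addr: &str) -> Result<SocketAddr, AddrParseError> {
    addr.parse::<SocketAddr>()
}

#[test]
fn expect_yields_the_ok_value() {
    let addr = parse_addr("127.0.0.1:15432").expect("Unable to parse database address");
    assert_eq!(addr.port(), 15432);
}

#[test]
#[should_panic(expected = "Unable to parse database address")]
fn expect_panics_with_our_message() {
    // The panic message starts with our text, followed by the underlying error.
    parse_addr("not an address").expect("Unable to parse database address");
}
```

<p>With <code>.unwrap()</code> the second test would still panic, but the message would only contain the underlying error and nothing about what we were trying to do.</p>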
<h2 id="querying-the-database"><a class="zola-anchor" href="#querying-the-database" aria-label="Anchor link for: querying-the-database">Querying the Database</a></h2>
<p>Let us now jump down to our <code>/orders</code> route and replace the static json response with an actual database result. We create our SQL string to fetch rows from the orders table. We also need to create a mutable <code>orders</code> vector (array) to store the <code>Order</code> objects we are mapping. We then fire off the query by passing in our SQL string and any parameters we want to bind. In this case, we have no parameters to bind so we pass a reference to an empty <a href="https://doc.rust-lang.org/std/primitive.slice.html">slice</a> (<code>&[]</code>). We loop over each row in the result, manually convert the result into an <code>Order</code> struct and store it in the <code>orders</code> vector. After all the rows have been converted, we call <code>json::encode()</code> on the orders vector and return that result. Remember, we derived <code>RustcEncodable</code> on the <code>Order</code> struct. The rustc_serialize crate already implemented <code>Encodable</code> on <code>Vec</code> too. The combination of all these <code>Encodable</code> trait implementations allows for the automatic serialization of <code>orders</code> using <code>json::encode()</code>.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span> get "</span><span style="color:#a3be8c;">/orders</span><span>" => |</span><span style="color:#bf616a;">_request</span><span>, </span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">response</span><span>| {
</span><span> </span><span style="color:#b48ead;">let</span><span> query = "</span><span style="color:#a3be8c;">SELECT id, total, currency, status FROM orders</span><span>";
</span><span> </span><span style="color:#b48ead;">let mut</span><span> orders = Vec::new();
</span><span> </span><span style="color:#b48ead;">for</span><span> row in &db.</span><span style="color:#96b5b4;">query</span><span>(query, &[]).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Failed to select orders</span><span>") {
</span><span> </span><span style="color:#b48ead;">let</span><span> order = Order {
</span><span> id: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">0</span><span>),
</span><span> total: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">1</span><span>),
</span><span> currency: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">2</span><span>),
</span><span> status: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">3</span><span>),
</span><span> };
</span><span>
</span><span> orders.</span><span style="color:#96b5b4;">push</span><span>(order);
</span><span> }
</span><span>
</span><span> response.</span><span style="color:#96b5b4;">set</span><span>(MediaType::Json);
</span><span> json::encode(&orders).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Failed to serialize orders</span><span>")
</span><span> }
</span></code></pre>
<h3 id="sync-error"><a class="zola-anchor" href="#sync-error" aria-label="Anchor link for: sync-error">Sync Error</a></h3>
<p>Below are all the changes we have made so far:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">macro_use</span><span>] </span><span style="color:#b48ead;">extern crate</span><span> nickel;
</span><span style="color:#b48ead;">extern crate</span><span> postgres;
</span><span style="color:#b48ead;">extern crate</span><span> rustc_serialize;
</span><span>
</span><span style="color:#b48ead;">use </span><span>nickel::{Nickel, MediaType};
</span><span style="color:#b48ead;">use </span><span>postgres::{Connection, SslMode};
</span><span style="color:#b48ead;">use </span><span>rustc_serialize::json;
</span><span>
</span><span>#[</span><span style="color:#bf616a;">derive</span><span>(RustcEncodable)]
</span><span style="color:#b48ead;">struct </span><span>Order {
</span><span> </span><span style="color:#bf616a;">id</span><span>: </span><span style="color:#b48ead;">i32</span><span>,
</span><span> </span><span style="color:#bf616a;">total</span><span>: </span><span style="color:#b48ead;">f64</span><span>,
</span><span> </span><span style="color:#bf616a;">currency</span><span>: String,
</span><span> </span><span style="color:#bf616a;">status</span><span>: String,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> db_url = "</span><span style="color:#a3be8c;">postgresql://myapp:dbpass@localhost:15432/myapp</span><span>";
</span><span> </span><span style="color:#b48ead;">let</span><span> db = Connection::connect(db_url, SslMode::None)
</span><span> .</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Unable to connect to database</span><span>");
</span><span> </span><span style="color:#b48ead;">let mut</span><span> server = Nickel::new();
</span><span>
</span><span> server.</span><span style="color:#96b5b4;">utilize</span><span>(router! {
</span><span> get "</span><span style="color:#a3be8c;">/orders</span><span>" => |</span><span style="color:#bf616a;">_request</span><span>, </span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">response</span><span>| {
</span><span> </span><span style="color:#b48ead;">let</span><span> query = "</span><span style="color:#a3be8c;">SELECT id, total, currency, status FROM orders</span><span>";
</span><span> </span><span style="color:#b48ead;">let mut</span><span> orders = Vec::new();
</span><span> </span><span style="color:#b48ead;">for</span><span> row in &db.</span><span style="color:#96b5b4;">query</span><span>(query, &[]).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Failed to select orders</span><span>") {
</span><span> </span><span style="color:#b48ead;">let</span><span> order = Order {
</span><span> id: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">0</span><span>),
</span><span> total: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">1</span><span>),
</span><span> currency: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">2</span><span>),
</span><span> status: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">3</span><span>),
</span><span> };
</span><span>
</span><span> orders.</span><span style="color:#96b5b4;">push</span><span>(order);
</span><span> }
</span><span>
</span><span> response.</span><span style="color:#96b5b4;">set</span><span>(MediaType::Json);
</span><span> json::encode(&orders).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Failed to serialize orders</span><span>")
</span><span> }
</span><span> });
</span><span>
</span><span> server.</span><span style="color:#96b5b4;">listen</span><span>("</span><span style="color:#a3be8c;">127.0.0.1:6767</span><span>");
</span><span>}
</span></code></pre>
<p>In our <code>main</code> function, we set up a connection to the database, create a nickel webserver and define our <code>/orders</code> route. Our <code>/orders</code> route calls a closure that uses the above database connection to fetch orders from the database and then serializes them into json. This looks pretty straightforward, but if we try to compile this code we will get a rather intimidating error message. If we parse through the multi-line error message, we can pull out two pieces of information:</p>
<ol>
<li>error: the trait <code>core::marker::Sync</code> is not implemented for the type <code>core::cell::UnsafeCell<postgres::InnerConnection></code></li>
<li><code>core::cell::UnsafeCell<postgres::InnerConnection></code> cannot be shared between threads safely</li>
</ol>
<p>Herein lies the beauty of Rust. The <code>Connection</code> object is not thread safe and, while it may not have been apparent, nickel serves requests in different threads. Rust only allows types that implement the <a href="https://doc.rust-lang.org/std/marker/trait.Sync.html">Sync</a> trait to be shared between threads. For a moment, let us be pragmatic about this. Rather than try to figure out how to make <code>Connection</code> thread safe, we will just work around it by establishing the postgres connection as part of the <code>/orders</code> request.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> db_url = "</span><span style="color:#a3be8c;">postgresql://myapp:dbpass@localhost:15432/myapp</span><span>";
</span><span> </span><span style="color:#b48ead;">let mut</span><span> server = Nickel::new();
</span><span>
</span><span> server.</span><span style="color:#96b5b4;">utilize</span><span>(router! {
</span><span> get "</span><span style="color:#a3be8c;">/orders</span><span>" => |</span><span style="color:#bf616a;">_request</span><span>, </span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">response</span><span>| {
</span><span> </span><span style="color:#b48ead;">let</span><span> db = Connection::connect(db_url, SslMode::None)
</span><span> .</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Unable to connect to database</span><span>");
</span><span> </span><span style="color:#b48ead;">let</span><span> query = "</span><span style="color:#a3be8c;">SELECT id, total, currency, status FROM orders</span><span>";
</span><span> </span><span style="color:#b48ead;">let mut</span><span> orders = Vec::new();
</span><span> </span><span style="color:#b48ead;">for</span><span> row in &db.</span><span style="color:#96b5b4;">query</span><span>(query, &[]).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Failed to select orders</span><span>") {
</span><span> </span><span style="color:#b48ead;">let</span><span> order = Order {
</span><span> id: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">0</span><span>),
</span><span> total: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">1</span><span>),
</span><span> currency: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">2</span><span>),
</span><span> status: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">3</span><span>),
</span><span> };
</span><span>
</span><span> orders.</span><span style="color:#96b5b4;">push</span><span>(order);
</span><span> }
</span><span>
</span><span> response.</span><span style="color:#96b5b4;">set</span><span>(MediaType::Json);
</span><span> json::encode(&orders).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Failed to serialize orders</span><span>")
</span><span> }
</span><span> });
</span><span>
</span><span> server.</span><span style="color:#96b5b4;">listen</span><span>("</span><span style="color:#a3be8c;">127.0.0.1:6767</span><span>");
</span><span>}
</span></code></pre>
<p>We can now do <code>cargo run</code> and make a curl request in another window to see our json response:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cargo run
</span><span> Running `target/debug/orders`
</span><span>Listening on http://127.0.0.1:6767
</span><span>Ctrl-C to shutdown server
</span></code></pre>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ curl --silent localhost:6767/orders | python -mjson.tool
</span><span>[
</span><span> {
</span><span> "currency": "USD",
</span><span> "id": 123,
</span><span> "status": "shipped",
</span><span> "total": 30.0
</span><span> },
</span><span> {
</span><span> "currency": "USD",
</span><span> "id": 124,
</span><span> "status": "processing",
</span><span> "total": 20.0
</span><span> }
</span><span>]
</span></code></pre>
<h3 id="fixing-the-sync-error"><a class="zola-anchor" href="#fixing-the-sync-error" aria-label="Anchor link for: fixing-the-sync-error">Fixing the Sync Error</a></h3>
<p>Now that we have a functioning webservice that connects to a Postgres database, let us stop and consider our approach. Making a connection per request may be fine for a database like MySQL, where connections are <a href="http://stackoverflow.com/a/99565/775246">stateful and cheap to create</a>, but it is not recommended for Postgres. We need to create a pool of connections that can be shared across the many different requests. Luckily for us, the creator of the postgres crate also created a connection pool called <a href="https://github.com/sfackler/r2d2">r2d2</a> with a Postgres specific <a href="https://github.com/sfackler/r2d2-postgres">adapter</a>. The connection pool internally uses a <a href="https://doc.rust-lang.org/std/sync/struct.Mutex.html">Mutex</a>, which implements <a href="https://doc.rust-lang.org/std/marker/trait.Sync.html">Sync</a>, allowing the connections to be shared across threads.</p>
<p>We also need to consider how we are passing our connection pool to the request. The <code>/orders</code> route is implemented using a <a href="https://doc.rust-lang.org/book/closures.html#move-closures">move closure</a>, which will take ownership of the connection pool once we try to use it. If we create another route and try to use the connection pool, the compiler will throw an error because we now have two closures trying to take ownership of the same value. We need to take advantage of nickel middleware in order to properly share the connection pool. The nickel framework already provides <a href="https://github.com/nickel-org/nickel-postgres">nickel-postgres</a> middleware for this very use-case.</p>
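<p>The effect of wrapping a value in a <code>Mutex</code> can be sketched with the standard library alone. <code>FakeConnection</code> is a hypothetical type that, like the real connection, mutates internal state through a shared reference and is therefore not <code>Sync</code>. The <code>assert_sync</code> helper is a common compile-time trick rather than a library function. This sketch assumes a Rust version with <code>std::thread::scope</code> (1.63 or later), which is much newer than the toolchain the rest of this post targets.</p>

```rust
use std::cell::Cell;
use std::sync::Mutex;
use std::thread;

// Compiles only for types that may be shared between threads by reference.
fn assert_sync<T: Sync>() {}

// Hypothetical connection with interior mutability, so it is not `Sync`.
struct FakeConnection {
    queries_run: Cell<u32>,
}

fn run_queries_from_threads(threads: u32) -> u32 {
    // assert_sync::<FakeConnection>(); // would not compile: `Cell<u32>` is not `Sync`
    assert_sync::<Mutex<FakeConnection>>(); // compiles: the mutex serializes access

    let shared = Mutex::new(FakeConnection {
        queries_run: Cell::new(0),
    });

    // Scoped threads may borrow `shared`; all of them are joined
    // before `scope` returns.
    thread::scope(|scope| {
        for _ in 0..threads {
            scope.spawn(|| {
                let conn = shared.lock().expect("mutex poisoned");
                conn.queries_run.set(conn.queries_run.get() + 1);
            });
        }
    });

    shared.into_inner().expect("mutex poisoned").queries_run.get()
}
```

<p>Uncommenting the first <code>assert_sync</code> line reproduces the same class of error we hit above, just with a much smaller type.</p>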
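<p>The ownership problem with two <code>move</code> closures is easy to reproduce outside of nickel. The sketch below shows one common way around it in plain Rust: give each closure its own clone of an <code>Arc</code> handle. This is a different technique from the middleware approach used in the rest of this post, and the <code>Pool</code> and <code>Handler</code> aliases are hypothetical stand-ins (the "pool" is just a shared counter).</p>

```rust
use std::sync::{Arc, Mutex};

// Hypothetical stand-ins: the "pool" is a shared counter and a
// route handler is a boxed closure.
type Pool = Arc<Mutex<u32>>;
type Handler = Box<dyn Fn() -> String + Send + Sync>;

fn build_handlers(pool: Pool) -> Vec<Handler> {
    // A `move` closure takes ownership of what it captures, so the first
    // closure gets its own handle. Cloning an `Arc` only bumps a reference
    // count; both handles point at the same underlying value.
    let orders_pool = Arc::clone(&pool);
    let orders: Handler = Box::new(move || {
        let mut uses = orders_pool.lock().expect("mutex poisoned");
        *uses += 1;
        format!("orders (pool use #{})", *uses)
    });

    // The second closure can take the original handle.
    let customers: Handler = Box::new(move || {
        let mut uses = pool.lock().expect("mutex poisoned");
        *uses += 1;
        format!("customers (pool use #{})", *uses)
    });

    vec![orders, customers]
}
```

<p>Without the <code>Arc::clone</code> call, moving <code>pool</code> into both closures fails to compile with a "use of moved value" error. Middleware avoids having to repeat this cloning in every route.</p>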
<h2 id="using-connection-pool-middleware"><a class="zola-anchor" href="#using-connection-pool-middleware" aria-label="Anchor link for: using-connection-pool-middleware">Using Connection Pool Middleware</a></h2>
<p>We need to add three more crates to Cargo.toml. The <code>nickel_postgres</code> crate requires a patch that has not been merged yet, so we are specifying a git revision. If/when the PR is accepted, I will update this section.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>r2d2 = "0.7.0"
</span><span>r2d2_postgres = "0.10.1"
</span><span>nickel_postgres = { git = "https://github.com/hjr3/nickel-postgres", rev = "9c1e21f" }
</span></code></pre>
<p>Once that is done, we need to import those three crates and then start specifying what parts of those crates we are going to use. The <code>r2d2_postgres</code> crate has a <code>PostgresConnectionManager</code> that wraps the standard <code>Connection</code> struct provided by the <code>postgres</code> crate. The <code>r2d2_postgres</code> crate also provides a different <code>SslMode</code> enum (I am not sure why), so we need to use that instead. This means we can get rid of the explicit postgres crate dependency and we can remove <code>postgres = "0.11.7"</code> from our Cargo.toml file.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">macro_use</span><span>] </span><span style="color:#b48ead;">extern crate</span><span> nickel;
</span><span style="color:#b48ead;">extern crate</span><span> rustc_serialize;
</span><span style="color:#b48ead;">extern crate</span><span> r2d2;
</span><span style="color:#b48ead;">extern crate</span><span> r2d2_postgres;
</span><span style="color:#b48ead;">extern crate</span><span> nickel_postgres;
</span><span>
</span><span style="color:#b48ead;">use </span><span>nickel::{Nickel, MediaType};
</span><span style="color:#b48ead;">use </span><span>rustc_serialize::json;
</span><span style="color:#b48ead;">use </span><span>r2d2::{Config, Pool};
</span><span style="color:#b48ead;">use </span><span>r2d2_postgres::{PostgresConnectionManager, SslMode};
</span><span style="color:#b48ead;">use </span><span>nickel_postgres::{PostgresMiddleware, PostgresRequestExtensions};
</span></code></pre>
<p>We will be passing the <code>PostgresConnectionManager</code> into a <code>Pool</code> provided by the <code>r2d2</code> crate. The <code>Pool</code> manages all of the complexity around sharing a fixed number of database connections across different threads. The <code>PostgresConnectionManager</code> implements the correct trait so the <code>Pool</code> can interact with Postgres connections. The <code>Pool</code> also accepts a <code>Config</code> struct that configures how the <code>Pool</code> will work. I chose to use the default settings, but you can customize it if you want a different number of connections.</p>
<p>Now that we have our connection pool set up, we need to create the middleware. The <code>PostgresMiddleware</code> struct abstracts away all the details of how the middleware works. We only need to create the middleware and pass on our connection pool. You will also notice that we use <code>PostgresRequestExtensions</code> from <code>nickel_postgres</code>. This is a trait that makes it easier for us to get a connection from the pool when inside of our request.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> db_url = "</span><span style="color:#a3be8c;">postgresql://myapp:dbpass@localhost:15432/myapp</span><span>";
</span><span> </span><span style="color:#b48ead;">let</span><span> db_mgr = PostgresConnectionManager::new(db_url, SslMode::None)
</span><span> .</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Unable to connect to database</span><span>");
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> db_pool = Pool::new(Config::default(), db_mgr)
</span><span> .</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Unable to initialize connection pool</span><span>");
</span><span>
</span><span> </span><span style="color:#b48ead;">let mut</span><span> server = Nickel::new();
</span><span> server.</span><span style="color:#96b5b4;">utilize</span><span>(PostgresMiddleware::new(db_pool));
</span><span>
</span><span> </span><span style="color:#65737e;">// ...
</span><span>}
</span></code></pre>
<p>When each request comes in, the middleware will put a reference to the connection pool on the request object. We can use <code>request.db_conn()</code>, made possible by the <code>PostgresRequestExtensions</code> trait, to get a database connection from the pool. Now we can use that connection just like we were before. Once our request goes out of scope, the connection will automatically be returned to the pool.</p>
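<p>To make the "automatically returned to the pool" part less magical, here is a minimal sketch of the idea using only the standard library. It is not how <code>r2d2</code> is implemented. <code>FakeConn</code>, <code>ToyPool</code> and <code>PooledConn</code> are hypothetical names invented for this illustration, and a real pool would normally block or time out when it is empty instead of returning <code>None</code>.</p>

```rust
use std::ops::Deref;
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for a real database connection.
pub struct FakeConn {
    pub id: u32,
}

// A toy pool: a shared, mutex-protected list of idle connections.
#[derive(Clone)]
pub struct ToyPool {
    idle: Arc<Mutex<Vec<FakeConn>>>,
}

// Guard object that hands its connection back to the pool when dropped.
pub struct PooledConn {
    conn: Option<FakeConn>,
    pool: ToyPool,
}

impl ToyPool {
    // Creates a pool holding a fixed number of connections.
    pub fn new(size: u32) -> ToyPool {
        let conns: Vec<FakeConn> = (0..size).map(|id| FakeConn { id }).collect();
        ToyPool { idle: Arc::new(Mutex::new(conns)) }
    }

    // Checks out a connection, or returns None if all of them are in use.
    pub fn get(&self) -> Option<PooledConn> {
        let conn = self.idle.lock().unwrap().pop()?;
        Some(PooledConn { conn: Some(conn), pool: self.clone() })
    }

    // Number of connections currently sitting idle in the pool.
    pub fn idle_count(&self) -> usize {
        self.idle.lock().unwrap().len()
    }
}

// Lets callers use the guard as if it were the connection itself.
impl Deref for PooledConn {
    type Target = FakeConn;

    fn deref(&self) -> &FakeConn {
        self.conn.as_ref().expect("connection is present until drop")
    }
}

// When the guard goes out of scope, the connection goes back to the pool.
impl Drop for PooledConn {
    fn drop(&mut self) {
        if let Some(conn) = self.conn.take() {
            self.pool.idle.lock().unwrap().push(conn);
        }
    }
}
```

<p>Calling <code>pool.get()</code> inside a block lowers <code>idle_count()</code> by one, and the count goes back up as soon as the block ends. The handler in this post relies on the same pattern: the value returned by <code>request.db_conn()</code> is a guard, so there is no explicit "release" call to forget.</p>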
<h3 id="tldr"><a class="zola-anchor" href="#tldr" aria-label="Anchor link for: tldr">TL;DR</a></h3>
<p>Here is our finished product:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">macro_use</span><span>] </span><span style="color:#b48ead;">extern crate</span><span> nickel;
</span><span style="color:#b48ead;">extern crate</span><span> rustc_serialize;
</span><span style="color:#b48ead;">extern crate</span><span> r2d2;
</span><span style="color:#b48ead;">extern crate</span><span> r2d2_postgres;
</span><span style="color:#b48ead;">extern crate</span><span> nickel_postgres;
</span><span>
</span><span style="color:#b48ead;">use </span><span>nickel::{Nickel, MediaType};
</span><span style="color:#b48ead;">use </span><span>rustc_serialize::json;
</span><span style="color:#b48ead;">use </span><span>r2d2::{Config, Pool};
</span><span style="color:#b48ead;">use </span><span>r2d2_postgres::{PostgresConnectionManager, SslMode};
</span><span style="color:#b48ead;">use </span><span>nickel_postgres::{PostgresMiddleware, PostgresRequestExtensions};
</span><span>
</span><span>#[</span><span style="color:#bf616a;">derive</span><span>(RustcEncodable)]
</span><span style="color:#b48ead;">struct </span><span>Order {
</span><span> </span><span style="color:#bf616a;">id</span><span>: </span><span style="color:#b48ead;">i32</span><span>,
</span><span> </span><span style="color:#bf616a;">total</span><span>: </span><span style="color:#b48ead;">f64</span><span>,
</span><span> </span><span style="color:#bf616a;">currency</span><span>: String,
</span><span> </span><span style="color:#bf616a;">status</span><span>: String,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> db_url = "</span><span style="color:#a3be8c;">postgresql://myapp:dbpass@localhost:15432/myapp</span><span>";
</span><span> </span><span style="color:#b48ead;">let</span><span> db_mgr = PostgresConnectionManager::new(db_url, SslMode::None)
</span><span> .</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Unable to connect to database</span><span>");
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> db_pool = Pool::new(Config::default(), db_mgr)
</span><span> .</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Unable to initialize connection pool</span><span>");
</span><span>
</span><span> </span><span style="color:#b48ead;">let mut</span><span> server = Nickel::new();
</span><span> server.</span><span style="color:#96b5b4;">utilize</span><span>(PostgresMiddleware::new(db_pool));
</span><span>
</span><span> server.</span><span style="color:#96b5b4;">utilize</span><span>(router! {
</span><span> get "</span><span style="color:#a3be8c;">/orders</span><span>" => |</span><span style="color:#bf616a;">request</span><span>, </span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">response</span><span>| {
</span><span> </span><span style="color:#b48ead;">let</span><span> query = "</span><span style="color:#a3be8c;">SELECT id, total, currency, status FROM orders</span><span>";
</span><span> </span><span style="color:#b48ead;">let mut</span><span> orders = Vec::new();
</span><span> </span><span style="color:#b48ead;">let</span><span> db = request.</span><span style="color:#96b5b4;">db_conn</span><span>().</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Failed to get a connection from pool</span><span>");
</span><span> </span><span style="color:#b48ead;">for</span><span> row in &db.</span><span style="color:#96b5b4;">query</span><span>(query, &[]).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Failed to select orders</span><span>") {
</span><span> </span><span style="color:#b48ead;">let</span><span> order = Order {
</span><span> id: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">0</span><span>),
</span><span> total: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">1</span><span>),
</span><span> currency: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">2</span><span>),
</span><span> status: row.</span><span style="color:#96b5b4;">get</span><span>(</span><span style="color:#d08770;">3</span><span>),
</span><span> };
</span><span>
</span><span> orders.</span><span style="color:#96b5b4;">push</span><span>(order);
</span><span> }
</span><span>
</span><span> response.</span><span style="color:#96b5b4;">set</span><span>(MediaType::Json);
</span><span> json::encode(&orders).</span><span style="color:#96b5b4;">expect</span><span>("</span><span style="color:#a3be8c;">Failed to serialize orders</span><span>")
</span><span> }
</span><span> });
</span><span>
</span><span> server.</span><span style="color:#96b5b4;">listen</span><span>("</span><span style="color:#a3be8c;">127.0.0.1:6767</span><span>");
</span><span>}
</span></code></pre>
<p>It was a bit of a journey, but we now have a webservice that can properly make requests to a Postgres database and return the result as a json response. Our first attempt ran into a compiler issue when <code>Connection</code> did not implement <code>Sync</code>. We had to modify our original approach to fit within the rules that the Rust compiler enforces. That, briefly, meant creating a database connection per request. Realizing this approach was not recommended, we refactored our webservice to use a connection pool that provided thread safety. We also decided to use nickel middleware to expose the connection pool to each request. It added a bit more complexity to our code, but the tradeoff is that we are now guaranteed to be free of data races when serving requests on different threads. You can find the complete working example on github at <a href="https://github.com/hjr3/webservice-demo-rs/tree/blog-post-2">https://github.com/hjr3/webservice-demo-rs/tree/blog-post-2</a>.</p>
http://activitystrea.ms/schema/1.0/postCreating a basic webservice in Rust2016-05-16T00:00:00+00:002016-05-16T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2016/05/16/creating-a-basic-webservice-in-rust.html/<p>In this post I am going to walk through the creation of a webservice in Rust. This is a <em>Getting Started</em> post that will serve as a foundation for future posts. The webservice will return a static json response to start. There are a few different options for web frameworks in Rust, but practically all of them use the underlying HTTP library called <a href="https://crates.io/crates/hyper">hyper</a>. I am most familiar with <a href="http://nickel.rs/">nickel</a>, so we will be using that. Once the code is complete, we will be creating a release build that is a completely static (standalone) binary. We will then be able to deploy this binary on any modern Linux distro, including Ubuntu and Alpine Linux.</p>
<p>Before we get into any real code, I want to document the environment I am using so you can follow along. I am using a MacBook Air with OS X version 10.11.4. I installed Rust using <a href="https://www.rustup.rs/">rustup.rs</a> and am using the current stable Rust version 1.8.0. At the time of this writing, rustup is in beta. However, it is quite stable and will soon be the official way to install Rust. I will not go into detail on how to install rustup. Please see the official documentation for that. Finally, I will be using a docker container to build a static binary using musl. I will be doing all development on my laptop and only using the docker container and musl to create a <em>release</em> build.</p>
<p>Mac OS X information:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ sw_vers
</span><span>ProductName: Mac OS X
</span><span>ProductVersion: 10.11.4
</span><span>BuildVersion: 15E65
</span></code></pre>
<p>Rust version (and toolchain):</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ rustup show
</span><span>stable-x86_64-apple-darwin (default)
</span><span>rustc 1.8.0 (db2939409 2016-04-11)
</span></code></pre>
<p>Docker version:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ docker -v
</span><span>Docker version 1.11.1, build 5604cbe
</span></code></pre>
<p>I mentioned earlier that we are going to use a web framework that is based on the hyper crate. The hyper crate supports TLS/SSL using the OpenSSL library. Unfortunately, Mac OS X 10.11 switched to using LibreSSL instead. In order to compile our webservice, we need to first install OpenSSL. While annoying, this is trivially easy using homebrew.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ brew install openssl
</span><span>$ brew link --force openssl
</span><span>$ openssl version
</span><span>OpenSSL 0.9.8zh 14 Jan 2016
</span></code></pre>
<p>Note: If you really do not want to install OpenSSL, you can build <em>debug</em> versions on the docker container too. Build times will be slow and testing/debugging will be significantly more difficult.</p>
<p>Now we can finally get to coding. Let us start by creating a new project using Cargo.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cargo new --bin demo
</span><span>$ cd demo
</span><span>$ cargo build
</span><span>Compiling demo v0.1.0 (file:///Users/herman/projects/demo)
</span></code></pre>
<p>Now we need to add the <a href="https://crates.io/crates/nickel">nickel</a> web framework so we can accept HTTP requests and deliver responses. We can be a little less conservative than crates.io when specifying our dependencies. None of these crates are 1.0 yet, but they are being developed and maintained. We do not want to wildcard (<code>"*"</code>) the entire version number for each dependency as that will leave our webservice susceptible to backwards compatibility breaks in the future. Leaving the <em>patch version</em> a wildcard will allow us to update the libraries when they have bug fixes in the future without risking a major backwards compatibility break. <strong>Edit: As was explained to me, cargo will update to the latest compatible version (in accordance with SemVer) by default. So <code>0.8.1</code>, <code>0.8.*</code> and <code>^0.8.1</code> all mean the same thing. The standard convention is to specify the version shown to you on crates.io. I have updated the below <code>Cargo.toml</code> example.</strong></p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cat Cargo.toml
</span><span>[package]
</span><span>name = "demo"
</span><span>version = "0.1.0"
</span><span>authors = ["Your Name <yourname@example.com>"]
</span><span>
</span><span>[dependencies]
</span><span>nickel = "0.8.1"
</span><span>$ cargo build
</span><span> Updating registry `https://github.com/rust-lang/crates.io-index`
</span><span> Compiling regex-syntax v0.3.1
</span><span> ...
</span><span> Compiling demo v0.1.0 (file:///Users/herman/projects/demo)
</span></code></pre>
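<p>To make the "latest compatible version" rule from the edit above concrete, here is a rough sketch of how a caret requirement such as <code>^0.8.1</code> can be checked. This is a simplification and not Cargo's actual code. The <code>Version</code> alias and the <code>caret_matches</code> function are names made up for this example, and pre-release tags are ignored entirely.</p>

```rust
// A version as (major, minor, patch). Pre-release tags are ignored in this sketch.
pub type Version = (u64, u64, u64);

// Returns true if `candidate` satisfies the caret requirement `^req`.
pub fn caret_matches(req: Version, candidate: Version) -> bool {
    // Never select something older than what was asked for.
    if candidate < req {
        return false;
    }

    match req {
        // ^0.0.z: only that exact patch release is compatible.
        (0, 0, _) => candidate == req,
        // ^0.y.z: the minor version must stay the same.
        (0, minor, _) => candidate.0 == 0 && candidate.1 == minor,
        // ^x.y.z: the major version must stay the same.
        (major, _, _) => candidate.0 == major,
    }
}
```

<p>Under this rule, <code>nickel = "0.8.1"</code> would accept a future <code>0.8.5</code> but not <code>0.9.0</code>, which is why pinning the patch version by hand is unnecessary for pre-1.0 crates.</p>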
<p>When we <code>cargo build</code>, we will download and compile 45 total crates into our webservice. This will also create a <a href="https://doc.rust-lang.org/book/getting-started.html#what-is-that-cargolock">Cargo.lock</a> file that contains exact information about our dependencies. Since we are building a binary/executable, we will commit the lock file with the rest of our code.</p>
<p>Now we can open up <code>src/main.rs</code> and start creating our service. In this first step, I will show how to write a handler for an HTTP GET request that generates a very simple json response.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">macro_use</span><span>] </span><span style="color:#b48ead;">extern crate</span><span> nickel;
</span><span>
</span><span style="color:#b48ead;">use </span><span>nickel::{Nickel, MediaType};
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span>
</span><span> </span><span style="color:#b48ead;">let mut</span><span> server = Nickel::new();
</span><span>
</span><span> server.</span><span style="color:#96b5b4;">utilize</span><span>(router! {
</span><span> get "</span><span style="color:#a3be8c;">/foo</span><span>" => |</span><span style="color:#bf616a;">_request</span><span>, </span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">response</span><span>| {
</span><span> response.</span><span style="color:#96b5b4;">set</span><span>(MediaType::Json);
</span><span> </span><span style="color:#b48ead;">r</span><span>#"</span><span style="color:#a3be8c;">{ "foo": "bar" }</span><span>"#
</span><span> }
</span><span> });
</span><span>
</span><span> server.</span><span style="color:#96b5b4;">listen</span><span>("</span><span style="color:#a3be8c;">127.0.0.1:6767</span><span>");
</span><span>}
</span></code></pre>
<p>Let us walk through the above code. We start by importing the nickel crate. The nickel crate includes macros and in order to use them we need to put <code>#[macro_use]</code> in front of the import statement. We then need to specify which parts of the nickel crate we want to use. It is valid to put <code>use nickel::*</code>, but I prefer to be explicit about which parts of a crate I am using.</p>
<p>Our main function is the entry point of our program. We create a new server object, declare what routes we want to handle and then start listening for requests. The design of nickel is similar to the <a href="http://expressjs.com/">Express</a> node.js framework. The <code>server.utilize</code> function is used to register middleware with the server object. Using the <code>router!</code> macro, we can specify each route we want to handle. To start, we want to accept an HTTP GET request for <code>/foo</code>. We can now specify how we want to handle that request inside of a lambda function.</p>
<p>The lambda function includes <code>request</code> and <code>response</code> parameters. Since we are not looking at any information from the request, we prepend it with an underbar (<code>_</code>) so the compiler does not throw a warning. We will be modifying the response to set the media type, so we prepend that with <code>mut</code>. We then set the media type for the response as json and create a literal json string to return. The <code>r#"..."#</code> syntax is the <a href="https://doc.rust-lang.org/reference.html#raw-string-literals">raw literal string notation</a> in Rust.</p>
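<p>If raw strings are new to you, this small comparison may help. It shows the same json text written as a raw string and as a regular string with escaped quotes. The constant names are made up for the example.</p>

```rust
// The same JSON text written two ways.
pub const RAW: &str = r#"{ "foo": "bar" }"#;
pub const ESCAPED: &str = "{ \"foo\": \"bar\" }";

// Using more `#` characters lets the raw string contain the sequence `"#` itself.
pub const NESTED: &str = r##"a raw string can contain "# when it uses two hashes"##;
```

<p>Both <code>RAW</code> and <code>ESCAPED</code> hold exactly the same characters. The raw form is just easier to read when the content is full of double quotes, which json usually is.</p>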
<p>We now have a working webservice. Let us see it in action.</p>
<p>Start the server:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cargo run
</span><span> Compiling demo v0.1.0 (file:///Users/herman/projects/demo)
</span><span> Running `target/debug/demo`
</span><span>Listening on http://127.0.0.1:6767
</span><span>Ctrl-C to shutdown server
</span></code></pre>
<p>Make a request in another terminal:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ curl --silent localhost:6767/foo | python -mjson.tool
</span><span>{
</span><span> "foo": "bar"
</span><span>}
</span></code></pre>
<p>The last thing we will do is create a static binary that can run on a modern Linux server. This is known as <a href="http://blog.rust-lang.org/2016/05/13/rustup.html">cross-compiling</a> and it is becoming a first-class feature of the Rust ecosystem. By default, compiling a Rust program to run on Linux has a few dynamic dependencies. There are many pros and cons to the <em>static vs dynamic</em> debate, but in this example I want to make the webservice completely static so I can deploy it without relying on the presence of any dynamic libraries. Mac OS X uses clang instead of gcc. In order to use musl, we will need gcc. I am going to use a docker container rather than install gcc. At the time of this writing, I am using Docker for Mac (beta). It should not matter how you have docker running on OS X though. I am going to use the <a href="https://github.com/emk/rust-musl-builder">rust-musl-builder</a> docker container, which was built specifically for this purpose.</p>
<p>If you do not have the container installed, running the below command will first pull that container. Annoyingly, it will not execute the command after it downloads the container.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ docker run --rm -it -v "$(pwd)":/home/rust/src ekidd/rust-musl-builder cargo build --release
</span><span>Unable to find image 'ekidd/rust-musl-builder:latest' locally
</span><span>latest: Pulling from ekidd/rust-musl-builder
</span><span>...
</span><span>Digest: sha256:0c2e9357d1cff9fc9c37396953749ca601fe4d3ee1b47104cd46d99a1a90f576
</span><span>Status: Downloaded newer image for ekidd/rust-musl-builder:latest
</span><span>$ docker run --rm -it -v "$(pwd)":/home/rust/src ekidd/rust-musl-builder cargo build --release
</span><span> Updating registry `https://github.com/rust-lang/crates.io-index`
</span><span> Downloading nickel v0.8.1
</span><span> ...
</span><span> Compiling utf8-ranges v0.1.3
</span><span> ...
</span><span> Compiling demo v0.1.0 (file:///home/rust/src)
</span><span>$ ls -lah target/x86_64-unknown-linux-musl/release/demo
</span><span>-rwxr-xr-x 1 herman staff 2.5M May 16 08:34 target/x86_64-unknown-linux-musl/release/demo
</span></code></pre>
<p>Now that we have the container, we can create the release build. It is going to take a bit of time. Rust has to download all the dependencies on that container, compile each of them and then compile our main program. We also specified the <code>--release</code> flag so Rust is optimizing each step of the build process.</p>
<p>We now have a 2.5MB statically compiled executable. We can run our webservice on any modern Linux distro just by copying the file there and running it. There is still a lot more to do to make a <em>production ready</em> webservice, but this is the basic foundation that we will refer back to when making future improvements. You can find the complete working example on github at <a href="https://github.com/hjr3/webservice-demo-rs/tree/blog-post-1">https://github.com/hjr3/webservice-demo-rs</a>.</p>
http://activitystrea.ms/schema/1.0/postWorking with C unions in Rust FFI2016-03-17T00:00:00+00:002016-03-17T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2016/03/17/unions-rust-ffi.html/<p>When building a foreign function interface to C code, we will inevitably run into a struct that has a union. Rust has no built-in support for unions, so we must come up with a strategy on our own. A union is a type in C that stores different data types in the same memory location. There are a number of reasons why someone may want to choose a union, including: converting between binary representations of integers and floats, implementing pseudo-polymorphism and direct access to bits. I am going to focus on the pseudo-polymorphism case.</p>
<p>Edit: Added a <a href="https://hermanradtke.com/2016/03/17/unions-rust-ffi.html/#warning">warning</a> at the bottom based on feedback from <a href="https://twitter.com/jckarter/status/710875695539310592">Joe Groff</a>.</p>
<p>Note: This post assumes the reader is familiar with <a href="https://doc.rust-lang.org/book/ffi.html">Rust FFI</a>, <a href="https://en.wikipedia.org/wiki/Endianness">endianness</a> and <a href="https://en.wikipedia.org/wiki/Ioctl">ioctl</a>.</p>
<p>As an example, let us get the MAC address based on an interface name. We can summarize the steps to get the MAC address as follows:</p>
<ul>
<li>Specify a request type to be used with <code>ioctl</code>. If I want to get the MAC (or hardware) address, I specify <code>SIOCGIFHWADDR</code>.</li>
<li>Write the interface name to <code>ifr_name</code>. An interface name is something like <code>eth0</code>.</li>
<li>Make the request using <code>ioctl</code>. A successful request will write some data to <code>ifr_ifru</code>.</li>
</ul>
<p>For more details on how to get a MAC address, read this <a href="http://www.microhowto.info/howto/get_the_mac_address_of_an_ethernet_interface_in_c_using_siocgifhwaddr.html">howto</a>.</p>
<p>We need to use the C <code>ioctl</code> function and also pass the <code>ifreq</code> struct to the function. Looking in <code>/usr/include/net/if.h</code>, we can see that <code>ifreq</code> is defined as follows:</p>
<pre data-lang="C" style="background-color:#2b303b;color:#c0c5ce;" class="language-C "><code class="language-C" data-lang="C"><span style="color:#b48ead;">struct </span><span>ifreq {
</span><span> </span><span style="color:#b48ead;">char</span><span> ifr_name[IFNAMSIZ];
</span><span> </span><span style="color:#b48ead;">union </span><span>{
</span><span> </span><span style="color:#b48ead;">struct</span><span> sockaddr ifru_addr;
</span><span> </span><span style="color:#b48ead;">struct</span><span> sockaddr ifru_dstaddr;
</span><span> </span><span style="color:#b48ead;">struct</span><span> sockaddr ifru_broadaddr;
</span><span> </span><span style="color:#b48ead;">short</span><span> ifru_flags;
</span><span> </span><span style="color:#b48ead;">int</span><span> ifru_metric;
</span><span> </span><span style="color:#b48ead;">int</span><span> ifru_mtu;
</span><span> </span><span style="color:#b48ead;">int</span><span> ifru_phys;
</span><span> </span><span style="color:#b48ead;">int</span><span> ifru_media;
</span><span> </span><span style="color:#b48ead;">int</span><span> ifru_intval;
</span><span> caddr_t ifru_data;
</span><span> </span><span style="color:#b48ead;">struct</span><span> ifdevmtu ifru_devmtu;
</span><span> </span><span style="color:#b48ead;">struct</span><span> ifkpi ifru_kpi;
</span><span> u_int32_t ifru_wake_flags;
</span><span> u_int32_t ifru_route_refcnt;
</span><span> </span><span style="color:#b48ead;">int</span><span> ifru_cap[</span><span style="color:#d08770;">2</span><span>];
</span><span> } ifr_ifru;
</span><span>};
</span></code></pre>
<p>The <code>ifr_ifru</code> union is where things start to get tricky. Glancing at the possible types in <code>ifr_ifru</code>, we notice that they are not all the same size. A <code>short</code> is 2 bytes and <code>u_int32_t</code> is 4 bytes. To complicate matters, we have a number of different struct definitions of unknown size. It is important that we figure out exactly what the size of the <code>ifreq</code> struct is so we can write the proper Rust code. I wrote a small C program and figured out that <code>ifreq</code> uses 16 bytes for <code>ifr_name</code> and 24 bytes for <code>ifr_ifru</code>.</p>
<p>Armed with the knowledge of how large the struct is, we can start representing this in Rust. One strategy is to make a specialized struct for each type in the union.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">repr</span><span>(C)]
</span><span style="color:#b48ead;">pub struct </span><span>IfReqShort {
</span><span> </span><span style="color:#bf616a;">ifr_name</span><span>: [c_char; 16],
</span><span> </span><span style="color:#bf616a;">ifru_flags</span><span>: c_short,
</span><span>}
</span></code></pre>
<p>We can use <code>IfReqShort</code> when making a request of type <code>SIOCGIFFLAGS</code>, which writes a <code>short</code> into <code>ifru_flags</code>. This struct is smaller than the <code>ifreq</code> struct in C though. Even though we are expecting only 2 bytes to be written, the external ioctl interface expects the union to be a total of 24 bytes. To be safe, let us add 22 bytes of padding at the end:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">repr</span><span>(C)]
</span><span style="color:#b48ead;">pub struct </span><span>IfReqShort {
</span><span> </span><span style="color:#bf616a;">ifr_name</span><span>: [c_char; 16],
</span><span> </span><span style="color:#bf616a;">ifru_flags</span><span>: c_short,
</span><span> </span><span style="color:#bf616a;">_padding</span><span>: [</span><span style="color:#b48ead;">u8</span><span>; 22],
</span><span>}
</span></code></pre>
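<p>Since getting the size wrong is the main risk with this approach, it is worth checking it in Rust as well. The sketch below repeats the padded struct and asks the compiler for its size. It assumes the 16 + 24 byte layout I measured above, and the <code>if_req_short_size</code> helper is just a name made up for the example.</p>

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_short};

// Same layout as the padded struct above: 16 + 2 + 22 bytes.
#[repr(C)]
pub struct IfReqShort {
    pub ifr_name: [c_char; 16],
    pub ifru_flags: c_short,
    pub _padding: [u8; 22],
}

// Reports how many bytes the Rust representation occupies.
pub fn if_req_short_size() -> usize {
    size_of::<IfReqShort>()
}
```

<p>If the function returns anything other than 40, the Rust struct no longer matches the 16 bytes of <code>ifr_name</code> plus 24 bytes of <code>ifr_ifru</code> that the C side expects.</p>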
<p>We would then repeat this process for each type in the union. I find this a bit tedious to do as we need to make a lot of structs and be very careful to make them the correct size. Another way to represent the union is to have a buffer of raw bytes. We can make a single C representation of <code>ifreq</code> in Rust like this:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">repr</span><span>(C)]
</span><span style="color:#b48ead;">pub struct </span><span>IfReq {
</span><span> </span><span style="color:#bf616a;">ifr_name</span><span>: [c_char; 16],
</span><span> </span><span style="color:#bf616a;">union</span><span>: [</span><span style="color:#b48ead;">u8</span><span>; 24],
</span><span>}
</span></code></pre>
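<p>Before we can make a request, the interface name has to be written into <code>ifr_name</code>. Here is one way that could look. The <code>with_name</code> constructor is a hypothetical helper for illustration only. It assumes the 16 byte name buffer from above and leaves room for the trailing NUL byte that C expects.</p>

```rust
use std::os::raw::c_char;

#[repr(C)]
pub struct IfReq {
    pub ifr_name: [c_char; 16],
    pub union: [u8; 24],
}

impl IfReq {
    // Hypothetical helper: builds a zeroed request for the given interface name.
    // Returns None if the name cannot be stored as a NUL-terminated C string.
    pub fn with_name(name: &str) -> Option<IfReq> {
        let bytes = name.as_bytes();

        // Keep one byte free for the NUL terminator and reject embedded NULs.
        if bytes.len() >= 16 || bytes.contains(&0) {
            return None;
        }

        let mut req = IfReq { ifr_name: [0; 16], union: [0; 24] };

        // Copy the name; the remaining bytes stay zero and act as the terminator.
        for (dst, src) in req.ifr_name.iter_mut().zip(bytes) {
            *dst = *src as c_char;
        }

        Some(req)
    }
}
```

<p>With this in place, <code>IfReq::with_name("eth0")</code> gives us a struct that is ready to be passed to <code>ioctl</code>, and an overly long name is rejected up front instead of silently overflowing into the union.</p>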
<p>This <code>union</code> buffer can store the raw bytes for any type. We can now define methods to convert the raw bytes into the type we want. We will avoid unsafe code by not using transmute. Let us create a method to get the MAC address by converting the raw bytes into a <code>sockaddr</code> C type.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">impl </span><span>IfReq {
</span><span> </span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">ifr_hwaddr</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> sockaddr {
</span><span> </span><span style="color:#b48ead;">let mut</span><span> s = sockaddr {
</span><span> sa_family: </span><span style="color:#b48ead;">u16</span><span>::from_be((</span><span style="color:#bf616a;">self</span><span>.union[</span><span style="color:#d08770;">0</span><span>] as </span><span style="color:#b48ead;">u16</span><span>) << </span><span style="color:#d08770;">8 </span><span>| (</span><span style="color:#bf616a;">self</span><span>.union[</span><span style="color:#d08770;">1</span><span>] as </span><span style="color:#b48ead;">u16</span><span>)),
</span><span> sa_data: [</span><span style="color:#d08770;">0</span><span>; </span><span style="color:#d08770;">14</span><span>],
</span><span> };
</span><span>
</span><span> </span><span style="color:#65737e;">// basically a memcpy
</span><span> </span><span style="color:#b48ead;">for </span><span>(i, b) in </span><span style="color:#bf616a;">self</span><span>.union[</span><span style="color:#d08770;">2</span><span>..</span><span style="color:#d08770;">16</span><span>].</span><span style="color:#96b5b4;">iter</span><span>().</span><span style="color:#96b5b4;">enumerate</span><span>() {
</span><span> s.sa_data[i] = *b as </span><span style="color:#b48ead;">i8</span><span>;
</span><span> }
</span><span>
</span><span> s
</span><span> }
</span><span>}
</span></code></pre>
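<p>The shift-and-<code>from_be</code> expression used for <code>sa_family</code> can look odd at first. The bytes are assembled as if they were big-endian and then converted to the native byte order, so the net effect is "read these two bytes exactly as the machine stores them". The sketch below isolates that step and compares it with <code>u16::from_ne_bytes</code>, which was added to the standard library after this post was written. Both function names are made up for the example.</p>

```rust
// The approach from the method above: build a big-endian value by shifting,
// then let `from_be` convert it to the native byte order.
pub fn read_u16_shift(data: &[u8; 24]) -> u16 {
    u16::from_be((data[0] as u16) << 8 | (data[1] as u16))
}

// The same read expressed with the newer standard library helper.
pub fn read_u16_native(data: &[u8; 24]) -> u16 {
    u16::from_ne_bytes([data[0], data[1]])
}
```

<p>Both functions return the same value on little-endian and big-endian machines. On a newer compiler the second form is probably the easier one to read, but the shifting version has the advantage of working on the Rust version used in this post.</p>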
<p>With this strategy, we have one struct and a method to convert the raw bytes into the concrete type that we want. Looking back at our <code>ifr_ifru</code> union, we will notice that there are at least two other requests that will also require us to create a <code>sockaddr</code> from raw bytes. To <em>DRY</em> this up, we could implement a private method on <code>IfReq</code> to convert raw bytes to <code>sockaddr</code>. However, we can do better by abstracting away the details of creating a <code>sockaddr</code>, <code>short</code>, <code>int</code>, etc. from <code>IfReq</code>. We really just want to <em>tell</em> the union to give us back a specified type. So, let us make an <code>IfReqUnion</code> type to do that:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">repr</span><span>(C)]
</span><span style="color:#b48ead;">struct </span><span>IfReqUnion {
</span><span> </span><span style="color:#bf616a;">data</span><span>: [</span><span style="color:#b48ead;">u8</span><span>; 24],
</span><span>}
</span><span>
</span><span style="color:#b48ead;">impl </span><span>IfReqUnion {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">as_sockaddr</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> sockaddr {
</span><span> </span><span style="color:#b48ead;">let mut</span><span> s = sockaddr {
</span><span> sa_family: </span><span style="color:#b48ead;">u16</span><span>::from_be((</span><span style="color:#bf616a;">self</span><span>.data[</span><span style="color:#d08770;">0</span><span>] as </span><span style="color:#b48ead;">u16</span><span>) << </span><span style="color:#d08770;">8 </span><span>| (</span><span style="color:#bf616a;">self</span><span>.data[</span><span style="color:#d08770;">1</span><span>] as </span><span style="color:#b48ead;">u16</span><span>)),
</span><span> sa_data: [</span><span style="color:#d08770;">0</span><span>; </span><span style="color:#d08770;">14</span><span>],
</span><span> };
</span><span>
</span><span> </span><span style="color:#65737e;">// basically a memcpy
</span><span> </span><span style="color:#b48ead;">for </span><span>(i, b) in </span><span style="color:#bf616a;">self</span><span>.data[</span><span style="color:#d08770;">2</span><span>..</span><span style="color:#d08770;">16</span><span>].</span><span style="color:#96b5b4;">iter</span><span>().</span><span style="color:#96b5b4;">enumerate</span><span>() {
</span><span> s.sa_data[i] = *b as </span><span style="color:#b48ead;">i8</span><span>;
</span><span> }
</span><span>
</span><span> s
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">as_int</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> c_int {
</span><span> </span><span style="color:#b48ead;">c_int</span><span>::from_be((</span><span style="color:#bf616a;">self</span><span>.data[</span><span style="color:#d08770;">0</span><span>] as </span><span style="color:#b48ead;">c_int</span><span>) << </span><span style="color:#d08770;">24 </span><span>|
</span><span> (</span><span style="color:#bf616a;">self</span><span>.data[</span><span style="color:#d08770;">1</span><span>] as </span><span style="color:#b48ead;">c_int</span><span>) << </span><span style="color:#d08770;">16 </span><span>|
</span><span> (</span><span style="color:#bf616a;">self</span><span>.data[</span><span style="color:#d08770;">2</span><span>] as </span><span style="color:#b48ead;">c_int</span><span>) << </span><span style="color:#d08770;">8 </span><span>|
</span><span> (</span><span style="color:#bf616a;">self</span><span>.data[</span><span style="color:#d08770;">3</span><span>] as </span><span style="color:#b48ead;">c_int</span><span>))
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">as_short</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> c_short {
</span><span> </span><span style="color:#b48ead;">c_short</span><span>::from_be((</span><span style="color:#bf616a;">self</span><span>.data[</span><span style="color:#d08770;">0</span><span>] as </span><span style="color:#b48ead;">c_short</span><span>) << </span><span style="color:#d08770;">8 </span><span>|
</span><span> (</span><span style="color:#bf616a;">self</span><span>.data[</span><span style="color:#d08770;">1</span><span>] as </span><span style="color:#b48ead;">c_short</span><span>))
</span><span> }
</span><span>}
</span></code></pre>
<p>We implement a method for each of the various types that make up the union. Now that our type conversions are handled by <code>IfReqUnion</code>, we can implement the methods on <code>IfReq</code> like this:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">repr</span><span>(C)]
</span><span style="color:#b48ead;">pub struct </span><span>IfReq {
</span><span> </span><span style="color:#bf616a;">ifr_name</span><span>: [c_char; IFNAMESIZE],
</span><span> </span><span style="color:#bf616a;">union</span><span>: IfReqUnion,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">impl </span><span>IfReq {
</span><span> </span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">ifr_hwaddr</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> sockaddr {
</span><span> </span><span style="color:#bf616a;">self</span><span>.union.</span><span style="color:#96b5b4;">as_sockaddr</span><span>()
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">ifr_dstaddr</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> sockaddr {
</span><span> </span><span style="color:#bf616a;">self</span><span>.union.</span><span style="color:#96b5b4;">as_sockaddr</span><span>()
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">ifr_broadaddr</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> sockaddr {
</span><span> </span><span style="color:#bf616a;">self</span><span>.union.</span><span style="color:#96b5b4;">as_sockaddr</span><span>()
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">ifr_ifindex</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> c_int {
</span><span> </span><span style="color:#bf616a;">self</span><span>.union.</span><span style="color:#96b5b4;">as_int</span><span>()
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">ifr_media</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> c_int {
</span><span> </span><span style="color:#bf616a;">self</span><span>.union.</span><span style="color:#96b5b4;">as_int</span><span>()
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">ifr_flags</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> c_short {
</span><span> </span><span style="color:#bf616a;">self</span><span>.union.</span><span style="color:#96b5b4;">as_short</span><span>()
</span><span> }
</span><span>}
</span></code></pre>
<p>We ended up with two structs. We have <code>IfReq</code> that represents the memory layout of the C struct <code>ifreq</code>. We will implement a method on <code>IfReq</code> for each type of <code>ioctl</code> request. We also have the <code>IfReqUnion</code> struct that handles the various types the <code>ifr_ifru</code> union might be. We will create a method for each type we need to handle. This is less work than creating a specialized struct for each type in the union and provides a better interface than doing the type conversion in <code>IfReq</code>.</p>
<p>Here is a more complete working <a href="https://github.com/hjr3/carp-rs/blob/5d56a62b1a698949a7252db637d3fbeadbb62e3b/src/mac.rs">example</a>. This is still a bit of a work in progress, but the tests pass and the code incorporates the concepts discussed above.</p>
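<p>As a side note, the shift-and-or conversions can also be written with the <code>from_ne_bytes</code> functions available on the integer types in newer versions of Rust. The following is only a sketch under a few assumptions: it uses <code>i16</code> and <code>i32</code> in place of <code>c_short</code> and <code>c_int</code> so that it needs no external crate (that is, it assumes a 16-bit short and a 32-bit int), and the <code>as_short_shift</code> name is made up here to compare the two approaches.</p>

```rust
// Stand-in for the article's `IfReqUnion`. Assumption: `i16`/`i32` replace
// `c_short`/`c_int` (16-bit short, 32-bit int) to avoid an external crate.
struct IfReqUnion {
    data: [u8; 24],
}

impl IfReqUnion {
    // Shift-and-or version, written the same way as in the article.
    fn as_short_shift(&self) -> i16 {
        i16::from_be((self.data[0] as i16) << 8 | (self.data[1] as i16))
    }

    // Same result: reinterpret the first two bytes in native byte order.
    fn as_short(&self) -> i16 {
        i16::from_ne_bytes([self.data[0], self.data[1]])
    }

    // Reinterpret the first four bytes in native byte order.
    fn as_int(&self) -> i32 {
        i32::from_ne_bytes([self.data[0], self.data[1], self.data[2], self.data[3]])
    }
}

fn main() {
    // Pretend the kernel wrote an int into the start of the union.
    let mut data = [0u8; 24];
    data[..4].copy_from_slice(&0x1234_5678_i32.to_ne_bytes());
    let u = IfReqUnion { data };

    assert_eq!(u.as_int(), 0x1234_5678);
    assert_eq!(u.as_short(), u.as_short_shift());
}
```

<p>Both versions read the bytes in the byte order of the machine, which is what we want for a value that C code on the same machine wrote into the union.</p>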
<h2 id="warning"><a class="zola-anchor" href="#warning" aria-label="Anchor link for: warning">Warning</a></h2>
<p>The above approach is not without problems. In the case of <code>ifreq</code>, we were fortunate that <code>ifr_name</code> was 16 bytes and was aligned on a word boundary. If <code>ifr_name</code> were not aligned to a 4 byte word boundary, we would run into a problem. Our <code>union</code> type is <code>[u8; 24]</code>, which has an alignment of a single byte. A C union, by contrast, takes the alignment of its most strictly aligned member, so the two layouts can differ. Here is a short example to illustrate this point. If we have a C struct with the following union:</p>
<pre data-lang="C" style="background-color:#2b303b;color:#c0c5ce;" class="language-C "><code class="language-C" data-lang="C"><span style="color:#b48ead;">struct </span><span>foo {
</span><span> </span><span style="color:#b48ead;">short</span><span> x;
</span><span> </span><span style="color:#b48ead;">union </span><span>{
</span><span> </span><span style="color:#b48ead;">int</span><span> z;
</span><span> } y;
</span><span>};
</span></code></pre>
<p>The above <code>foo</code> struct has a size of 8 bytes. Two bytes for <code>x</code>, two more bytes for padding and four bytes for <code>y</code>. If we tried to write this in Rust:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">repr</span><span>(C)]
</span><span style="color:#b48ead;">pub struct </span><span>Foo {
</span><span> </span><span style="color:#bf616a;">x</span><span>: </span><span style="color:#b48ead;">u16</span><span>,
</span><span> </span><span style="color:#bf616a;">y</span><span>: [</span><span style="color:#b48ead;">u8</span><span>; 4],
</span><span>}
</span></code></pre>
<p>The above <code>Foo</code> struct is only 6 bytes. Two bytes for x and then we can fit the first two <code>u8</code> elements of <code>y</code> in the same 4 byte <em>word</em> as <code>x</code>. This subtle difference may cause problems when being passed to a C function that is expecting a struct of 8 bytes.</p>
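<p>We can check these numbers with <code>std::mem::size_of</code> and <code>std::mem::align_of</code>. The sketch below uses two made-up names: <code>FooBytes</code> mirrors the <code>Foo</code> struct above, and <code>FooAligned</code> is a hypothetical alternative that stores the union as a <code>u32</code>, assuming the C <code>int</code> is 4 bytes with 4 byte alignment.</p>

```rust
use std::mem::{align_of, size_of};

// Mirrors the `Foo` struct above: the union is modelled as raw bytes.
#[allow(dead_code)]
#[repr(C)]
pub struct FooBytes {
    x: u16,
    y: [u8; 4],
}

// Hypothetical alternative: a field with the same size and alignment as
// the widest union member (assumed to be a 4-byte `int`).
#[allow(dead_code)]
#[repr(C)]
pub struct FooAligned {
    x: u16,
    y: u32,
}

fn main() {
    // No padding is needed after `x` because `[u8; 4]` is byte-aligned.
    println!("FooBytes:   size={} align={}", size_of::<FooBytes>(), align_of::<FooBytes>());
    // Two bytes of padding are inserted after `x` so that `y` is 4-byte aligned.
    println!("FooAligned: size={} align={}", size_of::<FooAligned>(), align_of::<FooAligned>());
}
```

<p>The second layout matches what the C compiler produces for <code>foo</code> on common platforms, at the cost of having to pick a backing type by hand for each union.</p>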
<p>Until Rust natively supports unions, this sort of FFI is difficult to get right. Good luck, but be careful!</p>
http://activitystrea.ms/schema/1.0/postExploring the Rust Standard Library2016-01-18T00:00:00+00:002016-01-18T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2016/01/18/exploring-rust-std-library.html/<p>I was writing some Rust with a colleague when they asked me about the cases where Rust dereferences types for us automatically. I said that Rust will <a href="http://stackoverflow.com/a/28552082">automatically dereference pointers when making method calls</a>, but otherwise there was no compiler magic. This conflicted with their experience with Rust, and they presented an example like this:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> a = </span><span style="color:#d08770;">1</span><span>;
</span><span style="color:#b48ead;">let</span><span> b = </span><span style="color:#d08770;">2</span><span>;
</span><span style="color:#b48ead;">let</span><span> c = a + &b;
</span></code></pre>
<p>The basic question is: <em>"How does this code work without us dereferencing <code>b</code>?"</em>. I think this is a great question and touches on an aspect of Rust that I really like.</p>
<p>Rust is written in, well, Rust. While I myself do not understand much of the type checking code, I can read and understand the vast majority of the core and standard libraries. Many languages, especially scripted ones, are not like this. Oftentimes the core and standard libraries are written in C. In order to see how this code works you have to not only understand C, but also understand how the data structures in C relate to your language. The barrier to entry is quite high. The net result is that most people rely on documentation, blogs or Stack Overflow to understand how major parts of the language work. They cannot go see for themselves. I really like that major parts of <em>Rust proper</em> are much more accessible.</p>
<p>Let us go see for ourselves why we can add a type <code>T</code> and a reference to a type <code>&U</code>. Along the way we will learn how to explore some of the inner workings of Rust. To start, we need to bring up the <a href="https://doc.rust-lang.org/std/">std library documentation</a> webpage. We can then search for the word "<a href="https://doc.rust-lang.org/std/?search=add">add</a>" and the first result will be <a href="https://doc.rust-lang.org/std/ops/trait.Add.html">std::ops::Add</a> with a summary description of <em>The Add trait is used to specify the functionality of +</em>. That seems like a good place to start. We now know that adding two things together is implemented using the <code>Add</code> trait. The webpage for the <code>Add</code> trait even shows us a simple implementation.</p>
<p>Scrolling down the webpage will list all the implementations of the <code>Add</code> trait that exist in the standard library. Looking at the 14th item in that list, you will see <code>impl<'a> Add<&'a usize> for usize</code>. The standard library has a specific implementation of the <code>Add</code> trait for the case when the right hand side (rhs) of the addition is a reference to a type. If you scroll down more you will see that all the numeric types are listed. Each numeric type has implementations of the <code>Add</code> trait for <code>T + U</code>, <code>&T + U</code>, <code>T + &U</code> and <code>&T + &U</code>. You can also find similar results for subtraction, multiplication and division.</p>
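<p>To see those implementations at work, here is a small sketch that exercises all four combinations for <code>usize</code>. The <code>add_by_ref</code> helper is a name made up for this example. It also calls <code>Add::add</code> directly to show that the <code>+</code> operator is just a trait method call.</p>

```rust
use std::ops::Add;

// Hypothetical helper: takes the right hand side by reference and relies
// on `impl Add<&usize> for usize`.
fn add_by_ref(a: usize, b: &usize) -> usize {
    a + b
}

fn main() {
    let a: usize = 1;
    let b: usize = 2;

    // T + T, T + &T, &T + T and &T + &T each resolve to an `Add` impl.
    assert_eq!(a + b, 3);
    assert_eq!(a + &b, 3);
    assert_eq!(&a + b, 3);
    assert_eq!(&a + &b, 3);

    // The operator is sugar for the trait method, so this is `a + &b`.
    let c = Add::add(a, &b);
    assert_eq!(c, add_by_ref(a, &b));
}
```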
<p>You will find this pattern repeated over and over in Rust. Some generic functionality is represented as a trait. In order to specify that functionality, that trait must be implemented. It is not uncommon to see long lists of implementations for traits in the standard library. While this may appear to be a lot of boilerplate, the benefit is that Rust can check our code at compile time (the alternative would be to wait until runtime which is less safe and makes our code slower).</p>
<p>If we jump back to the list of implementations for the <code>Add</code> trait you might notice something interesting. The standard library does not specify implementations for addition between different types. This code will not work:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> a: </span><span style="color:#b48ead;">u64 </span><span>= </span><span style="color:#d08770;">1</span><span>;
</span><span style="color:#b48ead;">let</span><span> b: </span><span style="color:#b48ead;">u32 </span><span>= </span><span style="color:#d08770;">2</span><span>;
</span><span style="color:#b48ead;">let</span><span> c = a + b;
</span></code></pre>
<p>If you ever ran into a compiler error about adding two different numeric types together before and wondered why that does not work, now you know. It is not some compiler magic, but instead the simple fact that the standard library does not list <code>impl Add<u32> for u64</code>.</p>
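<p>One way to make the example compile is to convert one side so that both operands have the same type. The sketch below widens the <code>u32</code> with <code>u64::from</code>, which cannot lose information. The <code>widen_add</code> function is a name made up for this example.</p>

```rust
// Hypothetical helper: widen the `u32` first so that the existing
// `impl Add<u64> for u64` applies.
fn widen_add(a: u64, b: u32) -> u64 {
    a + u64::from(b)
}

fn main() {
    let a: u64 = 1;
    let b: u32 = 2;

    // `a + b` does not compile: there is no `impl Add<u32> for u64`.
    let c = widen_add(a, b);
    println!("{}", c);
}
```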
<p>I mentioned above that we should not have to rely on the documentation to understand how standard features work. So far, we have been relying on the <em>excellent</em> documentation of the Rust standard library. If we scroll back up the webpage, we should see the <a href="https://doc.rust-lang.org/src/core/ops.rs.html#182-190">[src]</a> link to the actual source code for the <code>Add</code> trait. If we follow that link, we will see the definition of the <code>Add</code> trait and then a macro called <code>add_impl!</code> being defined. Macros can be a little hard to read, but we can generally understand that this macro defines <code>T + T</code>. Right below that macro we should see something like:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>add_impl! { </span><span style="color:#b48ead;">usize u8 u16 u32 u64 isize i8 i16 i32 i64 f32 f64 </span><span>}
</span></code></pre>
<p>Now we see how all the numeric types implement the <code>Add</code> trait for <code>T + T</code>. We need to look a little deeper to understand how references are handled. If we look back at the bottom of the <code>add_impl!</code> macro we will see another macro called <code>forward_ref_binop!</code>. If we scroll up the page we can find the definition for <code>forward_ref_binop!</code> and we will notice that it defines the behavior for <code>&T + U</code>, <code>T + &U</code> and <code>&T + &U</code>. Take note that the use of macros greatly decreased the amount of boilerplate in the Rust standard library. Macros are harder to read, but they are certainly powerful.</p>
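<p>To get a feel for how this forwarding works, here is a simplified sketch of the same idea applied to our own type. This is not the standard library's macro. The <code>Meters</code> type and the <code>forward_ref_add!</code> macro are made up for this example, and the macro assumes the type is <code>Copy</code>.</p>

```rust
use std::ops::Add;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(u32);

// The one "real" implementation: T + T.
impl Add for Meters {
    type Output = Meters;
    fn add(self, rhs: Meters) -> Meters {
        Meters(self.0 + rhs.0)
    }
}

// Simplified stand-in for a forwarding macro. It generates &T + T,
// T + &T and &T + &T by dereferencing and delegating to T + T.
macro_rules! forward_ref_add {
    ($t:ty) => {
        impl<'a> Add<$t> for &'a $t {
            type Output = <$t as Add<$t>>::Output;
            fn add(self, rhs: $t) -> Self::Output {
                *self + rhs
            }
        }

        impl<'a> Add<&'a $t> for $t {
            type Output = <$t as Add<$t>>::Output;
            fn add(self, rhs: &'a $t) -> Self::Output {
                self + *rhs
            }
        }

        impl<'a, 'b> Add<&'a $t> for &'b $t {
            type Output = <$t as Add<$t>>::Output;
            fn add(self, rhs: &'a $t) -> Self::Output {
                *self + *rhs
            }
        }
    };
}

forward_ref_add!(Meters);

fn main() {
    let a = Meters(1);
    let b = Meters(2);

    assert_eq!(a + b, Meters(3));
    assert_eq!(&a + b, Meters(3));
    assert_eq!(a + &b, Meters(3));
    assert_eq!(&a + &b, Meters(3));
}
```

<p>One macro invocation per type replaces three hand-written implementations, which is the same trade-off the standard library makes for the numeric types.</p>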
<p>I find myself following the above approach when I run into something about Rust I do not understand. This even works for crates listed on crates.io that generate documentation. For example, the <a href="https://crates.io/crates/mio/">mio crate</a> hosts <a href="http://rustdoc.s3-website-us-east-1.amazonaws.com/mio/v0.5.x/mio/">documentation</a> on Amazon S3 but the look, feel and functionality are the same as the official Rust documentation webpages. There are other ancillary benefits to exploring the Rust standard library. Along the way you learn things you were not explicitly looking for. The standard library can also be a great reference for how to implement something. The code is written to a high standard and puts a lot of emphasis on correctness. Reading the core and standard libraries may seem daunting at first, especially if you are not familiar with macros, but stick with it. With some practice and patience it will become much more familiar to you. At that point, you can start contributing too!</p>
http://activitystrea.ms/schema/1.0/postManaging Connection State With mio2015-10-23T00:00:00+00:002015-10-23T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/10/23/managing-connection-state-with-mio-rust.html/<p>I would wager most people who choose mio to solve their async IO problems are expecting more abstraction in the library. The oft repeated question of <em>Why doesn't mio have callbacks?</em> is evidence of this. In fact, it is a design goal of mio to add only as much abstraction as is necessary to provide a consistent API for the various OS async IO implementations. A consequence of this design decision is that there are subtle behavioral differences between platforms. It may be tempting to let mio manage the state of various connections, but I have found that this can have unintended consequences. For example, until recently <a href="https://github.com/carllerche/mio/pull/270">mio internally buffered kqueue events</a>. The way epoll is designed does not warrant buffering events and those of us using kqueue encountered some <a href="https://github.com/carllerche/mio/pull/265">interesting panic behavior on kqueue</a>. I was able to avoid this issue when I started managing connection state in conjunction with the recently added <code>Handler::tick()</code> function.</p>
<p>The idea for adding <code>Handler::tick()</code> came from a discussion about <a href="https://github.com/carllerche/mio/issues/219">deregister behavior</a>. The idea is that at the end of each event loop <em>tick</em>, the <code>Handler::tick()</code> function will be called. By default this function does nothing. We can implement this function to act as a checkpoint to sync the state of our connections with mio before the start of the next event loop <em>tick</em>. We have three types of state:</p>
<ul>
<li><strong>socket state</strong> - whether the connection is reset or not.</li>
<li><strong>event state</strong> - whether we need to register, reregister or do nothing.</li>
<li><strong>read/write state</strong> - whether we are in the middle of a read/write or not. I discussed a solution to this in my post on <a href="/2015/09/12/creating-a-simple-protocol-when-using-rust-and-mio.html">Creating A Simple Protocol When Using Rust and mio</a>.</li>
</ul>
<h2 id="socket-state"><a class="zola-anchor" href="#socket-state" aria-label="Anchor link for: socket-state">Socket State</a></h2>
<p>Our connection socket can stop working for many different reasons. When this does happen, we need to remove the connection from the connection slab. One straightforward approach is to immediately remove the connection from the slab when there is an error related to the socket. Keep in mind though that just because we had an error when trying to handle an event does not mean that there is not another event for that same token. If we try to handle that later event by looking up that connection using a token, we will inadvertently panic. We are now forced to try and keep track of whether a token still exists in the slab inside our <code>Handler</code>.</p>
<p>Instead of removing the connection from the slab immediately, we can keep track of whether the connection is reset or not inside the <code>Connection</code> struct. If we encounter an error, we will mark the socket as reset and leave the connection in the connection slab until the event loop tick is finished. Now that we have this information local to the connection, our <code>Handler</code> can check whether or not the connection is reset before trying to dispatch events to it. Finally, when our <code>Handler::tick()</code> method is called, we can check each connection to see if it is reset. If the connection is reset, we can then remove the connection from the slab. Since we did this at the end of the event loop, we can now be confident there are no more spurious events for our token.</p>
<p>Let us implement a simple way to keep track of socket state. The first thing we need to do is add an <code>is_reset: bool</code> variable to our <code>Connection</code> struct. If <code>is_reset</code> is <em>true</em>, then we will remove the connection from our connection slab. We will also create two new functions on our <code>Connection</code>:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">impl </span><span>Connection {
</span><span> </span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">mark_reset</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>) {
</span><span> trace!("</span><span style="color:#a3be8c;">connection mark_reset; token={:?}</span><span>", </span><span style="color:#bf616a;">self</span><span>.token);
</span><span>
</span><span> </span><span style="color:#bf616a;">self</span><span>.is_reset = </span><span style="color:#d08770;">true</span><span>;
</span><span> }
</span><span>
</span><span> #[</span><span style="color:#bf616a;">inline</span><span>]
</span><span> </span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">is_reset</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> </span><span style="color:#b48ead;">bool </span><span>{
</span><span> </span><span style="color:#bf616a;">self</span><span>.is_reset
</span><span> }
</span><span>}
</span></code></pre>
<p>Now the server can quickly determine if a connection has already been reset. If a connection is reset, we want to drop any <em>readable</em> or <em>writeable</em> events. If the connection is not reset, we are confident that we can dispatch an event to that connection. If there is an error when dispatching the event to the connection, then we want to mark that connection as reset.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> conn = </span><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">find_connection_by_token</span><span>(token);
</span><span>
</span><span style="color:#b48ead;">if</span><span> conn.</span><span style="color:#96b5b4;">is_reset</span><span>() {
</span><span> info!("</span><span style="color:#a3be8c;">{:?} has already been reset</span><span>", token);
</span><span> </span><span style="color:#b48ead;">return</span><span>;
</span><span>}
</span><span>
</span><span>conn.</span><span style="color:#96b5b4;">writable</span><span>().</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|</span><span style="color:#bf616a;">e</span><span>| {
</span><span> warn!("</span><span style="color:#a3be8c;">Write event failed for {:?}, {:?}</span><span>", token, e);
</span><span> conn.</span><span style="color:#96b5b4;">mark_reset</span><span>();
</span><span>});
</span></code></pre>
<p>At the end of the event loop tick, we can loop through our connections and check if any are reset. If they are, we then remove them from the connection slab. Unfortunately, there is not a really good way to iterate over the slab and remove connections from it at the same time, so we collect the tokens first and remove them in a second pass. Future changes to the <a href="https://github.com/carllerche/slab">slab crate</a> should make this easier by adding features like <a href="https://doc.rust-lang.org/nightly/collections/vec/struct.Vec.html#method.retain">Vec::retain</a>.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">tick</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>, </span><span style="color:#bf616a;">event_loop</span><span>: &</span><span style="color:#b48ead;">mut </span><span>EventLoop<Server>) {
</span><span> trace!("</span><span style="color:#a3be8c;">Handling end of tick</span><span>");
</span><span>
</span><span> </span><span style="color:#b48ead;">let mut</span><span> reset_tokens = Vec::new();
</span><span>
</span><span> </span><span style="color:#b48ead;">for</span><span> c in </span><span style="color:#bf616a;">self</span><span>.conns.</span><span style="color:#96b5b4;">iter</span><span>() {
</span><span> </span><span style="color:#b48ead;">if</span><span> c.</span><span style="color:#96b5b4;">is_reset</span><span>() {
</span><span> reset_tokens.</span><span style="color:#96b5b4;">push</span><span>(c.token);
</span><span> }
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">for</span><span> token in reset_tokens {
</span><span> </span><span style="color:#b48ead;">match </span><span style="color:#bf616a;">self</span><span>.conns.</span><span style="color:#96b5b4;">remove</span><span>(token) {
</span><span> Some(_c) => {
</span><span> debug!("</span><span style="color:#a3be8c;">reset connection; token={:?}</span><span>", token);
</span><span> }
</span><span> None => {
</span><span> warn!("</span><span style="color:#a3be8c;">Unable to remove connection for {:?}</span><span>", token);
</span><span> }
</span><span> }
</span><span> }
</span><span>}
</span></code></pre>
<p>Notice that we do not call the <code>EventLoop::deregister()</code> method when a connection is removed from the slab. When we remove a connection from the slab, mio will internally deregister the connection so no more events will be sent. If we call deregister too early, some async I/O implementations (such as kqueue) will send that event as <code>Token(0)</code>.</p>
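<p>The mark-then-sweep idea can be tried out in isolation without mio. The sketch below is not the code from mob. It assumes a <code>HashMap</code> keyed by a plain <code>usize</code> token in place of the slab, a stripped down <code>Connection</code>, and made-up <code>with_tokens</code> and <code>should_dispatch</code> helpers. Because <code>HashMap</code> has a <code>retain</code> method, the sweep fits in one line here.</p>

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the real `Connection`; only the flag matters here.
struct Connection {
    is_reset: bool,
}

// A `HashMap` keyed by token plays the role of the connection slab.
struct Server {
    conns: HashMap<usize, Connection>,
}

impl Server {
    fn with_tokens(tokens: &[usize]) -> Server {
        let conns = tokens
            .iter()
            .map(|&t| (t, Connection { is_reset: false }))
            .collect();
        Server { conns }
    }

    // An event failed: flag the connection but leave it in the map.
    fn mark_reset(&mut self, token: usize) {
        if let Some(conn) = self.conns.get_mut(&token) {
            conn.is_reset = true;
        }
    }

    // Later events in the same tick are dropped for flagged connections.
    fn should_dispatch(&self, token: usize) -> bool {
        self.conns.get(&token).map_or(false, |c| !c.is_reset)
    }

    // End of tick: sweep out flagged connections and report how many were removed.
    fn tick(&mut self) -> usize {
        let before = self.conns.len();
        self.conns.retain(|_, c| !c.is_reset);
        before - self.conns.len()
    }
}

fn main() {
    let mut server = Server::with_tokens(&[1, 2, 3]);
    server.mark_reset(2);

    // A second event for token 2 in the same tick is dropped instead of panicking.
    assert!(!server.should_dispatch(2));
    assert!(server.should_dispatch(1));

    assert_eq!(server.tick(), 1);
    assert_eq!(server.conns.len(), 2);
}
```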
<h2 id="event-state"><a class="zola-anchor" href="#event-state" aria-label="Anchor link for: event-state">Event State</a></h2>
<p>When I started using mio, I put calls to reregister <a href="https://github.com/hjr3/mob/blob/multi-echo-blog-post/src/main.rs">all over the place</a>. I found a couple of problems with this approach. The first problem is that it becomes increasingly difficult to keep track of when connections are getting added to or removed from the event loop. The second problem is that any spurious event has a good chance of causing a panic. Remember, this is asynchronous behavior and our mental model is often incorrect. I believe the best strategy is to handle all registration related activities inside of <code>Handler::tick()</code>. We can make it a goal not to reregister a connection more than once per event loop tick. We should also make it a goal not to reregister if the connection has not received an event.</p>
<p>Similar to our strategy with tracking socket state, we can add an <code>is_idle: bool</code> to our <code>Connection</code> struct. We will also add two similar functions:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">impl </span><span>Connection {
</span><span> </span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">mark_idle</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>) {
</span><span> trace!("</span><span style="color:#a3be8c;">connection mark_idle; token={:?}</span><span>", </span><span style="color:#bf616a;">self</span><span>.token);
</span><span>
</span><span> </span><span style="color:#bf616a;">self</span><span>.is_idle = </span><span style="color:#d08770;">true</span><span>;
</span><span> }
</span><span>
</span><span> #[</span><span style="color:#bf616a;">inline</span><span>]
</span><span> </span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">is_idle</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> </span><span style="color:#b48ead;">bool </span><span>{
</span><span> </span><span style="color:#bf616a;">self</span><span>.is_idle
</span><span> }
</span><span>}
</span></code></pre>
<p>At the bottom of our <code>Handler::ready()</code> method, we need to mark the connection as being idle:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#65737e;">// self.token is our `Server` token. we do not want to mark that idle
</span><span style="color:#b48ead;">if </span><span style="color:#bf616a;">self</span><span>.token != token {
</span><span> </span><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">find_connection_by_token</span><span>(token).</span><span style="color:#96b5b4;">mark_idle</span><span>();
</span><span>}
</span></code></pre>
<p>Our <code>Handler::tick()</code> method will now need to reregister any connection that is in an idle state. We can combine the check for reregistration with the check for reset connections in the same loop. We end up with:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">tick</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>, </span><span style="color:#bf616a;">event_loop</span><span>: &</span><span style="color:#b48ead;">mut </span><span>EventLoop<Server>) {
</span><span> trace!("</span><span style="color:#a3be8c;">Handling end of tick</span><span>");
</span><span>
</span><span> </span><span style="color:#b48ead;">let mut</span><span> reset_tokens = Vec::new();
</span><span>
</span><span> </span><span style="color:#b48ead;">for</span><span> c in </span><span style="color:#bf616a;">self</span><span>.conns.</span><span style="color:#96b5b4;">iter_mut</span><span>() {
</span><span> </span><span style="color:#b48ead;">if</span><span> c.</span><span style="color:#96b5b4;">is_reset</span><span>() {
</span><span> reset_tokens.</span><span style="color:#96b5b4;">push</span><span>(c.token);
</span><span> } </span><span style="color:#b48ead;">else if</span><span> c.</span><span style="color:#96b5b4;">is_idle</span><span>() {
</span><span> c.</span><span style="color:#96b5b4;">reregister</span><span>(event_loop)
</span><span> .</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|</span><span style="color:#bf616a;">e</span><span>| {
</span><span> warn!("</span><span style="color:#a3be8c;">Reregister failed {:?}</span><span>", e);
</span><span> c.</span><span style="color:#96b5b4;">mark_reset</span><span>();
</span><span> reset_tokens.</span><span style="color:#96b5b4;">push</span><span>(c.token);
</span><span> });
</span><span> }
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">for</span><span> token in reset_tokens {
</span><span> </span><span style="color:#b48ead;">match </span><span style="color:#bf616a;">self</span><span>.conns.</span><span style="color:#96b5b4;">remove</span><span>(token) {
</span><span> Some(_c) => {
</span><span> debug!("</span><span style="color:#a3be8c;">reset connection; token={:?}</span><span>", token);
</span><span> }
</span><span> None => {
</span><span> warn!("</span><span style="color:#a3be8c;">Unable to remove connection for {:?}</span><span>", token);
</span><span> }
</span><span> }
</span><span> }
</span><span>}
</span></code></pre>
<p>The full code can be found here: <a href="https://github.com/hjr3/mob/tree/state-blog-post">https://github.com/hjr3/mob/tree/state-blog-post</a>.</p>
<p>We now have four variables tracking various parts of our <code>Connection</code> state: <code>is_reset</code>, <code>is_idle</code>, <code>read_continuation</code> and <code>write_continuation</code>. The latter two being discussed in a <a href="/2015/09/12/creating-a-simple-protocol-when-using-rust-and-mio.html">previous blog post</a>. There is some overlap amongst these variables and I am thinking about how to represent all this state with one <em>state</em> variable on the <code>Connection</code> class.</p>
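<p>One possible shape for that single variable is an enum. The sketch below is only a thought experiment and is not in mob. The <code>ConnState</code> and <code>TickAction</code> names are made up, and it covers only the <code>is_reset</code> and <code>is_idle</code> flags, leaving the read and write continuations out.</p>

```rust
// Hypothetical replacement for the `is_reset` and `is_idle` booleans.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ConnState {
    Registered, // waiting for an event from the event loop
    Idle,       // handled an event this tick, needs to be reregistered
    Reset,      // socket failed, remove at the end of the tick
}

// What the end-of-tick handler should do with a connection.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TickAction {
    Nothing,
    Reregister,
    Remove,
}

impl ConnState {
    // Decide the end-of-tick action and the state for the next tick.
    fn end_of_tick(self) -> (ConnState, TickAction) {
        match self {
            ConnState::Registered => (ConnState::Registered, TickAction::Nothing),
            ConnState::Idle => (ConnState::Registered, TickAction::Reregister),
            ConnState::Reset => (ConnState::Reset, TickAction::Remove),
        }
    }
}

fn main() {
    for state in [ConnState::Registered, ConnState::Idle, ConnState::Reset] {
        println!("{:?} -> {:?}", state, state.end_of_tick());
    }
}
```

<p>With an enum, a connection cannot be both reset and idle at the same time, which is one of the overlaps the separate booleans allow.</p>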
<p>We are also doing a loop over the connection slab for each event loop tick. This can get heavy if we have a lot of connections in the slab. Usually connections are not dropping off that often, and if we are not under load we may not have many connections eligible to be reregistered. Right now, I am willing to take the perf hit in order to not crash. However, I am thinking about ways to accomplish the safety without having to loop so often.</p>
<p>While the solutions may not be ideal, I think it is worth talking about some of the challenges I faced getting mob working on top of mio. Some of the answers have been organic in nature and I will continue to improve them as I learn more.</p>
http://activitystrea.ms/schema/1.0/postGet Data From A URL In Rust2015-09-21T00:00:00+00:002015-09-21T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/09/21/get-data-from-a-url-rust.html/<p><strong>Tested on Rust 1.3</strong></p>
<p>Here is a high-level example of how to make an HTTP GET request to some URL. To make the example a little more interesting, the URL will have a json response body. We will parse the body and pluck some values from the parsed json.</p>
<p>There are a number of crates we could use to make an HTTP GET request, but I am partial to <a href="https://crates.io/crates/curl">curl</a>. The curl library should be familiar to a wide set of audiences and libcurl is rock solid. Also, I think the Rust interface to curl is really easy to read and use. I am going to request the <a href="https://www.hautelook.com">HauteLook</a> API root because that is where I work and it will return <a href="http://stateless.co/hal_specification.html">Hal</a> json.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">use </span><span>curl::http;
</span><span>
</span><span style="color:#b48ead;">let</span><span> url = "</span><span style="color:#a3be8c;">https://www.hautelook.com/api</span><span>";
</span><span> </span><span style="color:#b48ead;">let</span><span> resp = http::handle()
</span><span> .</span><span style="color:#96b5b4;">get</span><span>(url)
</span><span> .</span><span style="color:#96b5b4;">exec</span><span>()
</span><span> .</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|</span><span style="color:#bf616a;">e</span><span>| {
</span><span> panic!("</span><span style="color:#a3be8c;">Failed to get {}; error is {}</span><span>", url, e);
</span><span> });
</span></code></pre>
<p>A few things to point out. The curl crate allows the functions to be chained together, so the code reads really nicely. We can map the methods directly to the curl C interface:</p>
<ul>
<li><code>http::handle()</code> -> <code>curl_easy_init()</code></li>
<li><code>get(url)</code> -> <code>curl_easy_setopt(handle, CURLOPT_URL, url);</code></li>
<li><code>exec()</code> -> <code>curl_easy_perform(handle);</code></li>
</ul>
<p>The C interface would normally require us to explicitly close the handle, but Rust does this automatically for us. In Rust, we also need to unwrap the <code>Result<Response, ErrCode></code> returned by the call to <code>exec()</code>. Rather than just use <code>unwrap()</code>, we can use <code>unwrap_or_else()</code> and generate a more user-friendly error message. I will be using <code>unwrap_or_else()</code> throughout this example.</p>
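<p>The difference between <code>unwrap()</code> and <code>unwrap_or_else()</code> is easier to see in isolation. The sketch below uses only the standard library; <code>parse_status</code> is a hypothetical fallible step standing in for <code>exec()</code>, not part of the curl crate.</p>

```rust
use std::num::ParseIntError;

/// Hypothetical fallible step standing in for a network call.
fn parse_status(raw: &str) -> Result<u16, ParseIntError> {
    raw.trim().parse::<u16>()
}

/// Returns the parsed code, or panics with a message that includes the input.
/// A plain `unwrap()` would only print the bare error.
fn status_or_panic(raw: &str) -> u16 {
    parse_status(raw).unwrap_or_else(|e| {
        panic!("Failed to parse status {:?}; error is {}", raw, e);
    })
}
```

<p>The closure only runs on the error path, so building the message costs nothing when the call succeeds.</p>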
<p>Now that we have a response, we need to parse the json. Again, there are a number of crates we can use for this task. Let us choose <a href="https://crates.io/crates/serde_json">serde_json</a> as that looks to be the successor to <a href="https://crates.io/crates/rustc-serialize">rustc_serialize</a>. Before we start parsing json, we need to get at the response body. In curl, <code>resp.get_body()</code> will return a reference to a slice of unsigned 8-bit integers <code>&[u8]</code>. We need to turn those bytes into a <a href="https://doc.rust-lang.org/nightly/std/str/index.html">unicode string slice</a>.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> body = std::str::from_utf8(resp.</span><span style="color:#96b5b4;">get_body</span><span>()).</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|</span><span style="color:#bf616a;">e</span><span>| {
</span><span> panic!("</span><span style="color:#a3be8c;">Failed to parse response from {}; error is {}</span><span>", url, e);
</span><span>});
</span></code></pre>
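<p>If panicking is too harsh, the same conversion can return an error instead. This is a small standard-library sketch; the <code>body_to_str</code> helper and its error text are my own naming, not something the curl crate provides.</p>

```rust
use std::str;

/// Converts raw response bytes to text, reporting where decoding failed.
fn body_to_str(bytes: &[u8]) -> Result<&str, String> {
    str::from_utf8(bytes)
        // `valid_up_to` is the length of the longest valid UTF-8 prefix.
        .map_err(|e| format!("invalid UTF-8 after byte {}", e.valid_up_to()))
}
```

<p>No bytes are copied here: on success the returned <code>&str</code> borrows the same memory as the input slice.</p>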
<p>Now that we have our string slice, we can attempt to parse that string into a json <code>Value</code> type. This type will allow us to access specific fields within the json data.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> json: Value = serde_json::from_str(body).</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|</span><span style="color:#bf616a;">e</span><span>| {
</span><span> panic!("</span><span style="color:#a3be8c;">Failed to parse json; error is {}</span><span>", e);
</span><span>});
</span></code></pre>
<p>Let us take a look at the json response before we look at the code to pluck values from it. Without going into specifics about Hal or hypermedia, we have a json object that contains one key named <code>_links</code>. This key <code>_links</code> has a number of <a href="http://www.iana.org/assignments/link-relations/link-relations.xhtml">link relations</a> that correspond to an object that contains an <code>href</code>.</p>
<pre data-lang="json" style="background-color:#2b303b;color:#c0c5ce;" class="language-json "><code class="language-json" data-lang="json"><span>{
</span><span> "</span><span style="color:#a3be8c;">_links</span><span>": {
</span><span> "</span><span style="color:#a3be8c;">http://hautelook.com/rels/events</span><span>": {
</span><span> "</span><span style="color:#a3be8c;">href</span><span>": "</span><span style="color:#a3be8c;">https://www.hautelook.com/v4/events</span><span>"
</span><span> },
</span><span> "</span><span style="color:#a3be8c;">http://hautelook.com/rels/image-resizer</span><span>": {
</span><span> "</span><span style="color:#a3be8c;">href</span><span>": "</span><span style="color:#a3be8c;">https://www.hautelook.com/resizer/{width}x{height}/{imgPath}</span><span>",
</span><span> "</span><span style="color:#a3be8c;">templated</span><span>": </span><span style="color:#d08770;">true
</span><span> },
</span><span> "</span><span style="color:#a3be8c;">http://hautelook.com/rels/login</span><span>": {
</span><span> "</span><span style="color:#a3be8c;">href</span><span>": "</span><span style="color:#a3be8c;">https://www.hautelook.com/api/login</span><span>"
</span><span> },
</span><span> "</span><span style="color:#a3be8c;">http://hautelook.com/rels/login/soft</span><span>": {
</span><span> "</span><span style="color:#a3be8c;">href</span><span>": "</span><span style="color:#a3be8c;">https://www.hautelook.com/api/login/soft</span><span>"
</span><span> },
</span><span> "</span><span style="color:#a3be8c;">http://hautelook.com/rels/members</span><span>": {
</span><span> "</span><span style="color:#a3be8c;">href</span><span>": "</span><span style="color:#a3be8c;">https://www.hautelook.com/v4/members</span><span>"
</span><span> },
</span><span> "</span><span style="color:#a3be8c;">http://hautelook.com/rels/search2</span><span>": {
</span><span> "</span><span style="color:#a3be8c;">href</span><span>": "</span><span style="color:#a3be8c;">https://www.hautelook.com/api/search2/catalog</span><span>"
</span><span> },
</span><span> "</span><span style="color:#a3be8c;">profile</span><span>": {
</span><span> "</span><span style="color:#a3be8c;">href</span><span>": "</span><span style="color:#a3be8c;">https://www.hautelook.com/api/doc</span><span>"
</span><span> },
</span><span> "</span><span style="color:#a3be8c;">self</span><span>": {
</span><span> "</span><span style="color:#a3be8c;">href</span><span>": "</span><span style="color:#a3be8c;">https://www.hautelook.com/api</span><span>"
</span><span> }
</span><span> }
</span><span>}
</span></code></pre>
<p>Let us write some code to print out each link relation with the corresponding href value. This will involve us first getting <code>_links</code> and then iterating over the link relations inside of <code>_links</code>.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> links = json.</span><span style="color:#96b5b4;">as_object</span><span>()
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">object</span><span>| object.</span><span style="color:#96b5b4;">get</span><span>("</span><span style="color:#a3be8c;">_links</span><span>"))
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">links</span><span>| links.</span><span style="color:#96b5b4;">as_object</span><span>())
</span><span> .</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|| {
</span><span> panic!("</span><span style="color:#a3be8c;">Failed to get '_links' value from json</span><span>");
</span><span> });
</span><span>
</span><span style="color:#b48ead;">for </span><span>(rel, link) in links.</span><span style="color:#96b5b4;">iter</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> href = link.</span><span style="color:#96b5b4;">find</span><span>("</span><span style="color:#a3be8c;">href</span><span>")
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">value</span><span>| value.</span><span style="color:#96b5b4;">as_string</span><span>())
</span><span> .</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|| {
</span><span> panic!("</span><span style="color:#a3be8c;">Failed to get 'href' value from within '_links'</span><span>");
</span><span> });
</span><span>
</span><span> println!("</span><span style="color:#d08770;">{}</span><span style="color:#a3be8c;"> -> </span><span style="color:#d08770;">{}</span><span>", rel, href);
</span><span>}
</span></code></pre>
<p>In serde, the <code>Value</code> type represents all possible json values. Before we can do something meaningful, we must convert the value to a more specific json type. Since our json starts out as an object with one key, we need to first use the <code>as_object()</code> function. The <code>as_object()</code> function will convert the <code>Value</code> into a <code>BTreeMap</code> type. We can then use the <code>get</code> function that comes with <code>BTreeMap</code> to get at our link relations. I am using the <code>and_then()</code> function to avoid dealing with <code>unwrap()</code> over and over. I could have also written the code to get <code>links</code> like this:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> object = json.</span><span style="color:#96b5b4;">as_object</span><span>().</span><span style="color:#96b5b4;">unwrap</span><span>();
</span><span style="color:#b48ead;">let</span><span> links_value = object.</span><span style="color:#96b5b4;">get</span><span>("</span><span style="color:#a3be8c;">_links</span><span>").</span><span style="color:#96b5b4;">unwrap</span><span>();
</span><span style="color:#b48ead;">let</span><span> links = links_value.</span><span style="color:#96b5b4;">as_object</span><span>().</span><span style="color:#96b5b4;">unwrap</span><span>();
</span></code></pre>
<p>Since <code>links</code> is just a BTreeMap, we can iterate over all the key value pairs using <code>links.iter()</code>. The link relation, <code>rel</code>, is the key and the <code>link</code> is the value. I am using the <code>find()</code> function to get the <code>href</code> out of the <code>link</code>. The <code>find()</code> function basically combines <code>as_object()</code> and <code>get()</code>. In order to get the actual URL string, we need to use the <code>as_string()</code> function. All the functions to convert <code>Value</code> to a more specific type are <a href="https://github.com/serde-rs/json/blob/e950b51a773a48281ad943c1bbf8c67fc266804a/json/src/value.rs#L147">here</a>. There are also some more advanced functions like <code>lookup()</code> and <code>search()</code>.</p>
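<p>To show how <code>find()</code> and <code>and_then()</code> fit together without pulling in any crates, here is a sketch built on a tiny hand-rolled value type. The <code>Value</code> enum below is a hypothetical, heavily reduced stand-in for the serde one, and <code>sample</code> and <code>self_href</code> are helper names I made up for the example.</p>

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily reduced stand-in for a json value type.
#[derive(Debug, PartialEq)]
enum Value {
    String(String),
    Object(BTreeMap<String, Value>),
}

impl Value {
    fn as_object(&self) -> Option<&BTreeMap<String, Value>> {
        match self {
            Value::Object(map) => Some(map),
            _ => None,
        }
    }

    fn as_str(&self) -> Option<&str> {
        match self {
            Value::String(s) => Some(s.as_str()),
            _ => None,
        }
    }

    /// `find` is just `as_object()` followed by `get()`.
    fn find(&self, key: &str) -> Option<&Value> {
        self.as_object().and_then(|map| map.get(key))
    }
}

/// Builds `{"_links": {"self": {"href": <href>}}}` for the example.
fn sample(href: &str) -> Value {
    let mut link = BTreeMap::new();
    link.insert("href".to_string(), Value::String(href.to_string()));
    let mut links = BTreeMap::new();
    links.insert("self".to_string(), Value::Object(link));
    let mut root = BTreeMap::new();
    root.insert("_links".to_string(), Value::Object(links));
    Value::Object(root)
}

/// Walks root -> _links -> self -> href, yielding None if any step is missing.
fn self_href(root: &Value) -> Option<&str> {
    root.find("_links")
        .and_then(|links| links.find("self"))
        .and_then(|link| link.find("href"))
        .and_then(|href| href.as_str())
}
```

<p>Each <code>and_then()</code> only runs if the previous step produced a value, so a missing key anywhere in the chain turns the whole expression into <code>None</code> instead of a panic.</p>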
<p>Here is the code in its entirety:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">extern crate</span><span> curl;
</span><span style="color:#b48ead;">extern crate</span><span> serde_json;
</span><span>
</span><span style="color:#b48ead;">use </span><span>curl::http;
</span><span style="color:#b48ead;">use </span><span>serde_json::Value;
</span><span>
</span><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> url = "</span><span style="color:#a3be8c;">https://www.hautelook.com/api</span><span>";
</span><span> </span><span style="color:#b48ead;">let</span><span> resp = http::handle()
</span><span> .</span><span style="color:#96b5b4;">get</span><span>(url)
</span><span> .</span><span style="color:#96b5b4;">exec</span><span>()
</span><span> .</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|</span><span style="color:#bf616a;">e</span><span>| {
</span><span> panic!("</span><span style="color:#a3be8c;">Failed to get {}; error is {}</span><span>", url, e);
</span><span> });
</span><span>
</span><span> </span><span style="color:#b48ead;">if</span><span> resp.</span><span style="color:#96b5b4;">get_code</span><span>() != </span><span style="color:#d08770;">200 </span><span>{
</span><span> println!("</span><span style="color:#a3be8c;">Unable to handle HTTP response code </span><span style="color:#d08770;">{}</span><span>", resp.</span><span style="color:#96b5b4;">get_code</span><span>());
</span><span> </span><span style="color:#b48ead;">return</span><span>;
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> body = std::str::from_utf8(resp.</span><span style="color:#96b5b4;">get_body</span><span>()).</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|</span><span style="color:#bf616a;">e</span><span>| {
</span><span> panic!("</span><span style="color:#a3be8c;">Failed to parse response from {}; error is {}</span><span>", url, e);
</span><span> });
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> json: Value = serde_json::from_str(body).</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|</span><span style="color:#bf616a;">e</span><span>| {
</span><span> panic!("</span><span style="color:#a3be8c;">Failed to parse json; error is {}</span><span>", e);
</span><span> });
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> links = json.</span><span style="color:#96b5b4;">as_object</span><span>()
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">object</span><span>| object.</span><span style="color:#96b5b4;">get</span><span>("</span><span style="color:#a3be8c;">_links</span><span>"))
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">links</span><span>| links.</span><span style="color:#96b5b4;">as_object</span><span>())
</span><span> .</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|| {
</span><span> panic!("</span><span style="color:#a3be8c;">Failed to get '_links' value from json</span><span>");
</span><span> });
</span><span>
</span><span> </span><span style="color:#b48ead;">for </span><span>(rel, link) in links.</span><span style="color:#96b5b4;">iter</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> href = link.</span><span style="color:#96b5b4;">find</span><span>("</span><span style="color:#a3be8c;">href</span><span>")
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">value</span><span>| value.</span><span style="color:#96b5b4;">as_string</span><span>())
</span><span> .</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|| {
</span><span> panic!("</span><span style="color:#a3be8c;">Failed to get 'href' value from within '_links'</span><span>");
</span><span> });
</span><span>
</span><span> println!("</span><span style="color:#d08770;">{}</span><span style="color:#a3be8c;"> -> </span><span style="color:#d08770;">{}</span><span>", rel, href);
</span><span> }
</span><span>}
</span></code></pre>
<p>We now have all the knowledge we need to work with URLs that return a json response. I put the complete <a href="https://github.com/hjr3/rust-get-data-from-url">working example</a> on github.</p>
http://activitystrea.ms/schema/1.0/postCreating A Simple Protocol When Using Rust and mio2015-09-12T00:00:00+00:002015-09-12T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/09/12/creating-a-simple-protocol-when-using-rust-and-mio.html/<p>This post is going to walk through establishing a simple protocol when using mio.</p>
<p>Let us first talk about why a protocol is needed. There are two common network protocols in use today: UDP and TCP. UDP is a message oriented protocol that delivers the message in one chunk. The downside to UDP is that there is no guarantee of message delivery because UDP does not handle packet loss. Many people want to protect against packet loss so they choose TCP instead. TCP is a stream oriented protocol. Data is sent byte by byte. A "message" may come one byte at a time, in multi-byte chunks or all at once. The only thing we can count on with TCP is that the bytes will arrive in the same order they were sent. And here is the reason we need a higher level protocol: It is the task of the receiving socket to determine when it has enough data to make any sense of it.</p>
<p>I have seen two basic approaches to building a higher level protocol. The HTTP standard uses both, so let us look at how it works. An HTTP request is split into two parts: a header section and a body section. The header section contains meta information, mostly in the form of headers, used to precisely describe the request. We do not know ahead of time how long a header is or how many headers a request sends. However, HTTP ends each header line with <code>\r\n</code> and uses an empty line (a second <code>\r\n</code> immediately after) to signal the end of the header section. Within the header section is the <em>Content-Length</em> header that specifies how many bytes the body section is. So one approach is to use a marker, such as <code>\r\n</code>, to signal the end of the message. Another approach is to explicitly specify how many bytes to read. HTTP also has a <a href="https://en.wikipedia.org/wiki/Chunked_transfer_encoding">chunked transfer encoding</a> option in HTTP 1.1 that combines both of these approaches to read the body section.</p>
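<p>Both approaches can be sketched in a few lines of standard-library Rust. HTTP/1.1 ends the header section with an empty line, so the marker to look for on the wire is <code>\r\n\r\n</code>. The function names below are my own, and this is nowhere near a real HTTP parser: it ignores chunked encoding, duplicate headers and malformed input.</p>

```rust
/// Marker approach: returns the offset just past the blank line that ends
/// the header section, or None if the marker has not arrived yet.
fn header_end(buf: &[u8]) -> Option<usize> {
    buf.windows(4)
        .position(|w| w == b"\r\n\r\n")
        .map(|i| i + 4)
}

/// Length approach: pulls the Content-Length value out of the header
/// section, if present. Header names are matched case-insensitively.
fn content_length(head: &str) -> Option<usize> {
    head.split("\r\n")
        .filter_map(|line| line.split_once(':'))
        .find(|(name, _)| name.trim().eq_ignore_ascii_case("content-length"))
        .and_then(|(_, value)| value.trim().parse().ok())
}
```

<p>A receiver would keep buffering until <code>header_end</code> returns <code>Some</code>, then use <code>content_length</code> to decide how many more bytes make up the body.</p>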
<p>There are some really powerful tools for building protocols, such as <a href="https://capnproto.org/">capnproto</a>. I wanted something very simple that I could implement. I decided to tell the receiver how many bytes of data they should be expecting. To do this, I use the first 64 bits to specify how many bytes I am sending over the wire. My custom protocol is not <em>discoverable</em>. Both the sender and receiver have to agree ahead of time on this protocol and implement it.</p>
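<p>On the sending side, the framing amounts to writing the length before the payload. Here is a minimal sketch, assuming the 64-bit length is sent big-endian. It uses <code>u64::to_be_bytes</code> from the standard library (available in newer Rust releases than the one this post targets) rather than the byteorder crate, and <code>encode_frame</code> is a hypothetical helper name.</p>

```rust
/// Prefixes `payload` with its length as a big-endian u64 (8 bytes).
fn encode_frame(payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(8 + payload.len());
    frame.extend_from_slice(&(payload.len() as u64).to_be_bytes());
    frame.extend_from_slice(payload);
    frame
}
```

<p>A two byte payload such as <code>b"hi"</code> becomes a ten byte frame: eight bytes of length followed by the payload itself.</p>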
<p>The basic strategy for receiving is as follows:</p>
<ol>
<li>Read the first 64 bits from the socket.</li>
<li>Convert those bits into a <code>u64</code> type and determine the length of the message.</li>
<li>Read <code>message_length</code> bytes from the socket.</li>
</ol>
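<p>The three steps above can be sketched against any <em>blocking</em> reader using only the standard library. This deliberately sidesteps the <code>WouldBlock</code> handling that a non-blocking mio socket needs, and it assumes a big-endian length prefix. The <code>read_frame</code> name is hypothetical.</p>

```rust
use std::io::{self, Read};

/// Reads one length-prefixed message from a blocking reader.
fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    // Steps 1 and 2: read 8 bytes and decode them as a big-endian u64.
    let mut len_buf = [0u8; 8];
    reader.read_exact(&mut len_buf)?;
    let message_length = u64::from_be_bytes(len_buf);

    // Step 3: read exactly `message_length` bytes.
    // Note: a real server should cap this value before allocating.
    let mut payload = vec![0u8; message_length as usize];
    reader.read_exact(&mut payload)?;
    Ok(payload)
}
```

<p>With a blocking reader, <code>read_exact</code> hides the partial reads for us. With a non-blocking socket we have to do that bookkeeping ourselves.</p>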
<p>Either of the reads can receive <code>WouldBlock</code> which, <a href="/2015/07/12/my-basic-understanding-of-mio-and-async-io.html#i-would-block-you">we know</a>, means we have to try again later. This is not a problem for our first read of the 64 bits. However, if we receive <code>WouldBlock</code> during the second read then we have to remember to not try and read the first 64 bits from the socket when we try again. This means we have to keep some state around reads. We need to keep track of two pieces of information. The first is whether or not we are in the middle of reading. The second is, if we are in the middle of reading, how many bytes long the message is. I added <code>read_continuation: Option<u64></code> to my <code>Connection</code> struct to capture this.</p>
<p>Here is how we read the message length:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">read_message_length</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>) -> io::Result<Option<</span><span style="color:#b48ead;">u64</span><span>>> {
</span><span> </span><span style="color:#b48ead;">if let </span><span>Some(n) = </span><span style="color:#bf616a;">self</span><span>.read_continuation {
</span><span> </span><span style="color:#b48ead;">return </span><span>Ok(Some(n));
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">let mut</span><span> buf = [</span><span style="color:#d08770;">0</span><span style="color:#b48ead;">u8</span><span>; </span><span style="color:#d08770;">8</span><span>];
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> bytes = </span><span style="color:#b48ead;">match </span><span style="color:#bf616a;">self</span><span>.sock.</span><span style="color:#96b5b4;">try_read</span><span>(&</span><span style="color:#b48ead;">mut</span><span> buf) {
</span><span> Ok(None) => {
</span><span> </span><span style="color:#b48ead;">return </span><span>Ok(None);
</span><span> },
</span><span> Ok(Some(n)) => n,
</span><span> Err(e) => {
</span><span> </span><span style="color:#b48ead;">return </span><span>Err(e);
</span><span> }
</span><span> };
</span><span>
</span><span> </span><span style="color:#b48ead;">if</span><span> bytes < </span><span style="color:#d08770;">8 </span><span>{
</span><span> warn!("</span><span style="color:#a3be8c;">Found message length of {} bytes</span><span>", bytes);
</span><span> </span><span style="color:#b48ead;">return </span><span>Err(Error::new(ErrorKind::InvalidData, "</span><span style="color:#a3be8c;">Invalid message length</span><span>"));
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> msg_len = BigEndian::read_u64(buf.</span><span style="color:#96b5b4;">as_ref</span><span>());
</span><span> Ok(Some(msg_len))
</span><span>}
</span><span>
</span></code></pre>
<p>The function starts out by checking if we are in the middle of a read. If we are in the middle of a read, we already know the message length and can just return it immediately. Otherwise, I try to read 8 bytes from the socket. The <code>try_read</code> function is provided by <a href="https://github.com/carllerche/mio/blob/272fb3d06e8f7134c9611e1877b3ff71642ced67/src/io.rs#L57">mio</a> and will return <code>Ok(None)</code> on <code>WouldBlock</code>. If the read fails or fewer than 8 bytes were received, we return an error that will cause this connection to be reset. Finally, I use the <a href="https://crates.io/crates/byteorder">byteorder</a> crate to convert the bytes into a <code>u64</code> that will tell us how long the message is.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">pub fn </span><span style="color:#8fa1b3;">readable</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>) -> io::Result<Option<Vec<</span><span style="color:#b48ead;">u8</span><span>>>> {
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> msg_len = </span><span style="color:#b48ead;">match </span><span>try!(</span><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">read_message_length</span><span>()) {
</span><span> None => { </span><span style="color:#b48ead;">return </span><span>Ok(None); },
</span><span> Some(n) => n,
</span><span> };
</span><span>
</span><span> debug!("</span><span style="color:#a3be8c;">Expected message length: {}</span><span>", msg_len);
</span><span> </span><span style="color:#b48ead;">let mut</span><span> recv_buf : Vec<</span><span style="color:#b48ead;">u8</span><span>> = Vec::with_capacity(msg_len as </span><span style="color:#b48ead;">usize</span><span>);
</span><span>
</span><span> </span><span style="color:#65737e;">// resolve "multiple applicable items in scope [E0034]" error
</span><span> </span><span style="color:#b48ead;">let</span><span> sock_ref = <TcpStream as Read>::by_ref(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>.sock);
</span><span>
</span><span> </span><span style="color:#b48ead;">match</span><span> sock_ref.</span><span style="color:#96b5b4;">take</span><span>(msg_len as </span><span style="color:#b48ead;">u64</span><span>).</span><span style="color:#96b5b4;">try_read_buf</span><span>(&</span><span style="color:#b48ead;">mut</span><span> recv_buf) {
</span><span> Ok(None) => {
</span><span> debug!("</span><span style="color:#a3be8c;">CONN : read encountered WouldBlock</span><span>");
</span><span>
</span><span> </span><span style="color:#65737e;">// We are being forced to try again, but we already read the eight bytes off of
</span><span> </span><span style="color:#65737e;">// the wire that determined the length. We need to store the message length
</span><span> </span><span style="color:#65737e;">// so we can resume next time we get readable.
</span><span> </span><span style="color:#bf616a;">self</span><span>.read_continuation = Some(msg_len as </span><span style="color:#b48ead;">u64</span><span>);
</span><span> Ok(None)
</span><span> },
</span><span> Ok(Some(n)) => {
</span><span> debug!("</span><span style="color:#a3be8c;">CONN : we read {} bytes</span><span>", n);
</span><span>
</span><span> </span><span style="color:#b48ead;">if</span><span> n < msg_len as </span><span style="color:#b48ead;">usize </span><span>{
</span><span> </span><span style="color:#b48ead;">return </span><span>Err(Error::new(ErrorKind::InvalidData, "</span><span style="color:#a3be8c;">Did not read enough bytes</span><span>"));
</span><span> }
</span><span>
</span><span> </span><span style="color:#bf616a;">self</span><span>.read_continuation = None;
</span><span>
</span><span> Ok(Some(recv_buf))
</span><span> },
</span><span> Err(e) => {
</span><span> error!("</span><span style="color:#a3be8c;">Failed to read buffer for token {:?}, error: {}</span><span>", </span><span style="color:#bf616a;">self</span><span>.token, e);
</span><span> Err(e)
</span><span> }
</span><span> }
</span><span>}
</span></code></pre>
<p>Our <code>readable</code> function starts out by determining the length of the message and then creates a vector with a capacity that is at least message length. I would have preferred a fixed slice, but I do not know of a way to create that slice dynamically. Then we need to read at <em>most</em> <code>msg_len</code> bytes from the socket. We can do this using the <code>take</code> function. This starts to look a bit messy due to some Rust issues. If we just call <code>self.sock.by_ref()</code> Rust is not able to determine which <code>by_ref</code> function to use. The error message looks something like:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>src/connection.rs:76:25: 76:33 error: multiple applicable items in scope [E0034]
</span><span>src/connection.rs:76 match self.sock.by_ref().take(msg_len as u64).try_read_buf(&mut recv_buf) {
</span><span> ^~~~~~~~
</span><span>src/connection.rs:76:25: 76:33 help: run `rustc --explain E0034` to see a detailed explanation
</span><span>src/connection.rs:76:25: 76:33 note: candidate #1 is defined in an impl of the trait `std::io::Read` for the type `&mut _`
</span><span>src/connection.rs:76 match self.sock.by_ref().take(msg_len as u64).try_read_buf(&mut recv_buf) {
</span><span> ^~~~~~~~
</span><span>src/connection.rs:76:25: 76:33 note: candidate #2 is defined in an impl of the trait `std::io::Write` for the type `&mut _`
</span><span>src/connection.rs:76 match self.sock.by_ref().take(msg_len as u64).try_read_buf(&mut recv_buf) {
</span><span> ^~~~~~~~
</span><span>src/connection.rs:76:25: 76:33 note: candidate #3 is defined in an impl of the trait `core::iter::Iterator` for the type `&mut _`
</span><span>src/connection.rs:76 match self.sock.by_ref().take(msg_len as u64).try_read_buf(&mut recv_buf) {
</span><span> ^~~~~~~~
</span><span>src/connection.rs:76:25: 76:33 note: candidate #4 is defined in an impl of the trait `std::io::Read` for the type `mio::net::tcp::TcpStream`
</span><span>src/connection.rs:76 match self.sock.by_ref().take(msg_len as u64).try_read_buf(&mut recv_buf) {
</span><span> ^~~~~~~~
</span><span>src/connection.rs:76:25: 76:33 note: candidate #5 is defined in an impl of the trait `std::io::Write` for the type `mio::net::tcp::TcpStream`
</span><span>src/connection.rs:76 match self.sock.by_ref().take(msg_len as u64).try_read_buf(&mut recv_buf) {
</span></code></pre>
<p>In order to resolve this, we need to use <a href="https://doc.rust-lang.org/book/ufcs.html">Universal Function Call Syntax</a>, also called UFCS. Using UFCS, we can be explicit about which <code>by_ref</code> function we want. We can then use that reference to <code>take</code> at <em>most</em> <code>msg_len</code> bytes from the socket. Now we just need to handle the different responses from the socket. If <code>try_read_buf</code> returns <code>None</code> (meaning <code>WouldBlock</code>), then we need to store the length of the message in <code>self.read_continuation</code> so we can try again later. If we successfully read from the socket, we set <code>self.read_continuation</code> to <code>None</code> so the next readable event will know to first determine the message length.</p>
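<p>The same ambiguity is easy to reproduce without mio. In this sketch the <code>Reader</code> and <code>Writer</code> traits and the <code>Sock</code> type are hypothetical stand-ins. Both traits define a method with the same name, so the caller has to say which one it means.</p>

```rust
trait Reader {
    fn describe(&self) -> &'static str;
}

trait Writer {
    fn describe(&self) -> &'static str;
}

struct Sock;

impl Reader for Sock {
    fn describe(&self) -> &'static str {
        "reader"
    }
}

impl Writer for Sock {
    fn describe(&self) -> &'static str {
        "writer"
    }
}

/// `sock.describe()` would be ambiguous (E0034), so name the trait explicitly.
fn describe_both(sock: &Sock) -> (&'static str, &'static str) {
    (
        <Sock as Reader>::describe(sock),
        <Sock as Writer>::describe(sock),
    )
}
```

<p>The <code>&lt;Type as Trait&gt;::method(receiver)</code> form is the same shape as the <code>&lt;TcpStream as Read&gt;::by_ref</code> call used in <code>readable</code>.</p>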
<p>I have tested this a fair bit and find it works well. The fact that mob echoes every received message to every connected socket causes messages to naturally coalesce. Knowing the message length ahead of time helps separate the messages out. The write strategy is similar enough to the read strategy that I will not go over it here. The working code is located <a href="https://github.com/hjr3/mob/tree/protocol-blog-post">on github</a>, so please use that as a reference for the write strategy if you are curious. Having a basic protocol like this is exciting as it will set us up to handle sending or receiving json, xml or other data formats later on.</p>
<h2 id="related"><a class="zola-anchor" href="#related" aria-label="Anchor link for: related">Related</a></h2>
<ul>
<li><a href="/2015/07/12/my-basic-understanding-of-mio-and-async-io.html">Creating A Multi-echo Server using Rust and mio</a></li>
</ul>
http://activitystrea.ms/schema/1.0/postUsing Docker to Test Rust on Linux2015-08-23T00:00:00+00:002015-08-23T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/08/23/using-docker-to-test-rust-on-linux.html/<p>I use a MacBook Air as my main laptop. I have been using Vagrant to test Rust programs on Linux. This feels a little heavy to me though as I have to create a Vagrant machine for each repository. The up and halt phases of the Vagrant box are a little slow and each machine eats away at my available hard drive space. I do not need an entire virtualized operating system, just a place to test my programs. This seems like a good use case for <a href="https://www.docker.com/">Docker</a>.</p>
<p>The first thing I had to do was get <a href="http://boot2docker.io/">boot2docker</a> installed. This was pretty straight-forward, but did take a fair bit of time. The second thing was finding an upstream Rust container (I think this is called an image) to use. I do not want to build one myself. Doing a <a href="https://hub.docker.com/search/?q=rust&page=1&isAutomated=0&isOfficial=0&starCount=0&pullCount=0">search on Docker Hub</a> I chose the <a href="https://hub.docker.com/r/schickling/rust/">schickling/rust</a> container. The repo info had a simple walkthrough, the Dockerfile itself seemed straightforward and it included gdb.</p>
<p>This container is set up to be used interactively or to run commands. To run cargo tests:</p>
<pre data-lang="bash" style="background-color:#2b303b;color:#c0c5ce;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#bf616a;">$</span><span> docker run</span><span style="color:#bf616a;"> --rm -it -v </span><span>$(</span><span style="color:#bf616a;">pwd</span><span>):/source schickling/rust cargo test
</span></code></pre>
<p>Let us go over the command. If you are familiar with Docker, you can skip this section. To start, <code>docker run</code> runs a command in a new container. The <code>--rm</code>, <code>-it</code> and <code>-v</code> flags are very common when running Docker containers. Here is what they mean:</p>
<ul>
<li><code>--rm</code> - automatically remove the container when it exits. This means when the command is over, the container will stop running and be cleaned up. This removes the <em>container</em>, but not the <code>schickling/rust</code> <em>image</em>.</li>
<li><code>-it</code> - make the docker container interactive and allocate a tty. This basically means your shell will work.</li>
<li><code>-v</code> - mount a volume. In the example above, it binds the local <em>present working directory</em> to the <code>/source</code> directory inside the container.</li>
</ul>
<p>After the flags, we specify the upstream Docker container name <code>schickling/rust</code> and then finally our command.</p>
<p>If you want to experiment inside the Linux container, just omit the command. The Dockerfile used to build <code>schickling/rust</code> specifies <code>bash</code> as the default command.</p>
<pre data-lang="bash" style="background-color:#2b303b;color:#c0c5ce;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#bf616a;">$</span><span> docker run</span><span style="color:#bf616a;"> --rm -it -v </span><span>$(</span><span style="color:#bf616a;">pwd</span><span>):/source schickling/rust
</span></code></pre>
<p>This is basically like a <code>vagrant ssh</code> command. You will be given a shell inside the container. I do this when playing with <a href="https://github.com/hjr3/mob">mob</a> because I want to run the server, then the client in various ways. Example:</p>
<pre data-lang="bash" style="background-color:#2b303b;color:#c0c5ce;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#bf616a;">$</span><span> docker run</span><span style="color:#bf616a;"> --rm -it -v </span><span>$(</span><span style="color:#bf616a;">pwd</span><span>):/source schickling/rust
</span><span style="color:#bf616a;">root@f237a067addb:/source#</span><span> cargo build
</span><span style="color:#bf616a;">root@f237a067addb:/source#</span><span> RUST_LOG=trace ./target/debug/mob-server </span><span style="color:#d08770;">2</span><span>> server.log &
</span><span style="color:#bf616a;">root@f237a067addb:/source#</span><span> ./target/debug/mob-client
</span><span style="color:#bf616a;">root@f237a067addb:/source#</span><span> fg
</span></code></pre>
<p>Make sure you run <code>cargo build</code> inside the container as Linux cannot run the executable built on OS X. If you see the error <code>bash: ./target/debug/mob-server: cannot execute binary file</code> then you need to <code>cargo build</code>. I then run the mob server in the background and send the log output to a file. I run the mob client (sometimes repeatedly). When done, I use the <code>fg</code> command to bring the mob server process back into the foreground where I can terminate it (using Ctrl-C). I then exit the container (using Ctrl-D), the container is cleaned up and I can start fresh again if I want.</p>
<p>Docker was fairly easy to get set up and I have found it to be more efficient for these types of use cases. The hardest part was getting <a href="http://boot2docker.io/">boot2docker</a> installed correctly. Docker will not completely replace Vagrant on my machine, but it certainly has found a place.</p>
http://activitystrea.ms/schema/1.0/postCreating a PHP Extension to Rust2015-08-03T00:00:00+00:002015-08-03T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/08/03/creating-a-php-extension-to-rust.html/<p>I am going to walk through the creation of a PHP extension that works with a Rust library. I have a <a href="https://github.com/hjr3/rust-php-ext">working example</a> here. I also created a PHP extension for my <a href="https://github.com/hjr3/selecta/tree/php-ext">Rust selecta port</a>. Both examples use the same foreign function interface (ffi). I made sure to pick an example that uses strings because strings add additional complexity that numbers do not introduce.</p>
<h2 id="before-getting-started"><a class="zola-anchor" href="#before-getting-started" aria-label="Anchor link for: before-getting-started">Before Getting Started</a></h2>
<p>Note: I created a <a href="https://hub.docker.com/r/hjr3/rust-php-ext/">Docker container</a> that will set the environment up.</p>
<p>You are going to need a development version of PHP. You can test if you have it by running:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ which phpize
</span></code></pre>
<p>If the <code>which</code> command finds something like <code>/usr/local/bin/phpize</code>, you are in business. If you do not have it, I believe you can run <code>yum install php-devel</code> on CentOS or <code>apt-get install php5-dev</code> on Debian/Ubuntu. You can also compile PHP from source to get it.</p>
<h2 id="compiling-the-extension"><a class="zola-anchor" href="#compiling-the-extension" aria-label="Anchor link for: compiling-the-extension">Compiling The Extension</a></h2>
<p>Our <a href="https://github.com/hjr3/rust-php-ext/blob/master/rust/src/lib.rs">Rust library</a> exposes a single function named <code>ext_score</code>. It takes two strings, each passed as a pointer and a length, and returns a 64-bit floating point type (a double). To build the Rust library:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cd rust
</span><span>$ cargo build
</span></code></pre>
<p>Our <a href="https://github.com/hjr3/rust-php-ext/blob/master/php-ext/score.c">PHP extension</a> defines a single function named <code>score</code> that will glue PHP userland to our <code>ext_score</code> Rust function. To build the PHP extension:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cd php-ext
</span><span>$ phpize
</span><span>$ ./configure --with-score=../rust/target/debug
</span><span>$ make
</span><span>$ php -d extension=modules/score.so -r "var_dump(score('vim', 'vi'));"
</span></code></pre>
<p>Now that we have a working example, we can explore what each of the files is actually doing.</p>
<h2 id="configuring-the-extension"><a class="zola-anchor" href="#configuring-the-extension" aria-label="Anchor link for: configuring-the-extension">Configuring The Extension</a></h2>
<p>I am going to dive right into the autotools stuff. I think autoconf is magic and the PHP wrappers around autoconf are <em>dark magic</em>. However, it is the biggest hurdle to getting a PHP extension working. All the stuff going on here is dense and it would take a whole blog post to go through it in enough detail. You can usually get away with copy/pasting this stuff and tinkering with it so it works. I will try and touch on a number of things that have tripped me up though. If you make it through this section, the rest is easy.</p>
<p>If this is way more than you need, feel free to just start hardcoding stuff in your extension. <a href="https://github.com/hjr3/selecta/commit/b48de0ae95618447a5d237bf48e2dbd8ac45e203#diff-ec28c2fa28e17d40dbe2bee40768b51fR7">I did</a>! You can then skip down to where I start talking about the source code.</p>
<p>Here is the <code>config.m4</code> file I wrote for my extension. Let us walk through what is going on inside here.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>PHP_ARG_WITH(score,
</span><span> [whether to enable the "score" extension],
</span><span> [ --with-score Enable "score" extension support])
</span><span>
</span><span>if test "$PHP_SCORE" != "no"; then
</span><span>
</span><span> SEARCH_PATH="/usr/local /usr"
</span><span> SEARCH_FOR="libscore.so"
</span><span> if test -r $PHP_SCORE/$SEARCH_FOR; then # path given as parameter
</span><span> SCORE_LIB_DIR=$PHP_SCORE
</span><span> else # search default path list
</span><span> AC_MSG_CHECKING([for score files in default path])
</span><span> for i in $SEARCH_PATH ; do
</span><span> if test -r $i/lib/$SEARCH_FOR; then
</span><span> SCORE_LIB_DIR=$i/lib
</span><span> AC_MSG_RESULT(found in $i)
</span><span> fi
</span><span> done
</span><span> fi
</span><span>
</span><span> if test -z "$SCORE_LIB_DIR"; then
</span><span> AC_MSG_RESULT([not found])
</span><span> AC_MSG_ERROR([Please reinstall the score rust library])
</span><span> fi
</span><span>
</span><span> PHP_CHECK_LIBRARY(score, ext_score,
</span><span> [
</span><span> PHP_ADD_LIBRARY_WITH_PATH(score, $SCORE_LIB_DIR, SCORE_SHARED_LIBADD)
</span><span> AC_DEFINE(HAVE_SCORE, 1, [whether ext_score function exists])
</span><span> ],[
</span><span> AC_MSG_ERROR([ext_score function not found in libscore])
</span><span> ],[
</span><span> -L$SCORE_LIB_DIR -R$SCORE_LIB_DIR
</span><span> ])
</span><span>
</span><span> PHP_SUBST(SCORE_SHARED_LIBADD)
</span><span> PHP_NEW_EXTENSION(score, score.c, $ext_shared)
</span><span>fi
</span></code></pre>
<p>The <code>config.m4</code> file is a mix of bash, some autoconf (AC) functions and some custom PHP functions. At a high level, we are writing some code to detect where our Rust library exists and then add that information into a <code>Makefile</code> that will be auto-generated. That <code>Makefile</code> is generated from a script called <code>configure</code>. The majority of the <code>configure</code> script is going to be created for us by PHP tooling. However, we need to add some extension-specific information.</p>
<p>Let us start by hooking our extension into the <code>configure</code> script using <code>PHP_ARG_WITH</code>. The <code>PHP_ARG_WITH</code> function takes three parameters:</p>
<ol>
<li>The name of the extension. This will be used to determine the name of our extension variable. In this case, <code>$PHP_SCORE</code>.</li>
<li>The human readable string shown when <code>./configure --with-score</code> is run. Example: <em>checking whether to enable the "score" extension... yes, shared</em></li>
<li>The human readable string shown when <code>./configure --help</code> is run. This is why the spacing of the string is a bit odd.</li>
</ol>
<p>Now we can run <code>./configure --with-score</code> and the <code>configure</code> script will know what we are talking about. Next, we need to tell the configure script where to find our library so it can add those details to the Makefile.</p>
<h3 id="no-header-file"><a class="zola-anchor" href="#no-header-file" aria-label="Anchor link for: no-header-file">No Header File</a></h3>
<p>PHP assumes that a library comes with a header file that describes the functions a library exposes. Rust's FFI does not provide a header file. If we were working with a library, such as <a href="http://gearman.org/">gearman</a>, then we would expect <code>/usr/include/gearman.h</code> to exist. The standard PHP <code>config.m4</code> file uses this header file to check if a library is installed or not. To work around this lack of a header file, we can look for the shared object file instead: <code>SEARCH_FOR="libscore.so"</code>. Now that we have a Rust compatible file to check for, we need to start searching for it.</p>
<p>Before we start checking for our <code>libscore.so</code> shared object in commonly used directories like <code>/usr</code> and <code>/usr/local</code>, we want to first allow an override via <code>./configure --with-score=/path/to/library</code>. This is really useful when working on our Rust library in conjunction with the PHP extension. I can run <code>cargo build</code> and that will install <code>libscore.so</code> in <code>/home/herman/projects/selecta/php-ext/target/debug/</code>. I can then configure my PHP extension using <code>./configure --with-score=/home/herman/projects/selecta/php-ext/target/debug/</code>. When I specify a path like this, the path will be stored in the <code>$PHP_SCORE</code> variable. This saves us from having to repeatedly <em>install</em> our Rust library. If no override was specified, we can start searching some common places. Feel free to add more directories to search for, such as <code>/opt/local</code>.</p>
<h3 id="validating-before-linking"><a class="zola-anchor" href="#validating-before-linking" aria-label="Anchor link for: validating-before-linking">Validating Before Linking</a></h3>
<p>We have located a file called <code>libscore.so</code>, but we need to make sure it is a valid library file. The <code>PHP_CHECK_LIBRARY</code> function is used to validate that our shared object contains a known function, or <em>symbol</em>. The <code>PHP_CHECK_LIBRARY</code> function takes five parameters:</p>
<ol>
<li>The name of the library. In our case <em>score</em> will be transformed into <code>-lscore</code> when compiling. Example: <code>cc -o conftest -g -O0 -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lscore conftest.c</code></li>
<li>The name of the function to try and find within our <em>score</em> library.</li>
<li>The set of actions to take if the function is found. In our case, we are adding code to the Makefile to compile against our library and defining <code>HAVE_SCORE</code>, which is used during compilation.</li>
<li>The set of actions to take if the function is not found. In our case, we are throwing an error with a human readable error message.</li>
<li>The set of extra library definitions. In our case, we are making sure the compiler knows where to find our shared object.</li>
</ol>
<p>The <code>PHP_ADD_LIBRARY_WITH_PATH</code> function takes three parameters:</p>
<ol>
<li>The name of the library.</li>
<li>The path to the library.</li>
<li>The name of a variable to store library information. We will use this with <code>PHP_SUBST</code>.</li>
</ol>
<h3 id="final-steps"><a class="zola-anchor" href="#final-steps" aria-label="Anchor link for: final-steps">Final Steps</a></h3>
<p>We are almost there!</p>
<p>The <code>PHP_SUBST</code> function adds a variable with its value into the Makefile.</p>
<p>The <code>PHP_NEW_EXTENSION</code> function takes a lot of parameters, but I am only going to go over three:</p>
<ol>
<li>The name of the extension</li>
<li>The list of sources, or files, used to build the extension.</li>
<li>Whether the extension should be dynamically loaded or statically compiled. The <code>$ext_shared</code> variable sets this to the proper value.</li>
</ol>
<h3 id="building-your-own-extension"><a class="zola-anchor" href="#building-your-own-extension" aria-label="Anchor link for: building-your-own-extension">Building Your Own Extension</a></h3>
<p>Normally, you can use the <a href="https://github.com/php/php-src/blob/master/ext/ext_skel">ext_skel</a> program to create a PHP extension out of the box. However, the <code>ext_skel</code> generated <code>config.m4</code> file makes some assumptions that Rust violates. It is a good starting point though. Change to the directory where you want the extension to be created and then run <code>ext_skel</code>:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>$ cd /path/to/projects
</span><span>$ /path/to/php-src/ext/ext_skel --extname=php-rust-ext
</span></code></pre>
<p>This will create a <code>/path/to/projects/php-rust-ext</code> directory with the following files: <code>config.m4 config.w32 tests</code>. I did not go over <code>config.w32</code> as that is for Windows and I am woefully ignorant when it comes to PHP and Windows. The <code>config.m4</code> has a lot of comments to help you and you have my notes above to make any necessary changes.</p>
<h3 id="making-changes-to-config-m4"><a class="zola-anchor" href="#making-changes-to-config-m4" aria-label="Anchor link for: making-changes-to-config-m4">Making Changes To config.m4</a></h3>
<p>Once you think you have the <code>config.m4</code> file properly set up, run the <code>phpize</code> command. This program will add a bunch of auto-generated files to your directory. Feel free to <code>.gitignore</code> them and do not check them into version control. Most importantly, it creates our <code>configure</code> file which we will now use to generate our <code>Makefile</code>.</p>
<p>You will need to make changes to the <code>config.m4</code> to get your specific extension working with your library. If you make a change to the <code>config.m4</code> file, then make sure you run <code>phpize</code> again. If you make a change and then just run <code>./configure --with-score</code> then you will not get the benefit of your changes.</p>
<h2 id="extension-header-file"><a class="zola-anchor" href="#extension-header-file" aria-label="Anchor link for: extension-header-file">Extension Header File</a></h2>
<p>Here is the standard PHP header file for an extension. The convention is to use <code>php_[extension-name].h</code> as the name. In our case, <code>php_score.h</code>.</p>
<pre data-lang="c" style="background-color:#2b303b;color:#c0c5ce;" class="language-c "><code class="language-c" data-lang="c"><span style="color:#b48ead;">#ifndef</span><span> PHP_SCORE_H
</span><span>
</span><span style="color:#b48ead;">#define </span><span>PHP_SCORE_H
</span><span>
</span><span style="color:#b48ead;">#define </span><span>PHP_SCORE_EXTNAME "</span><span style="color:#a3be8c;">score</span><span>"
</span><span style="color:#b48ead;">#define </span><span>PHP_SCORE_EXTVER "</span><span style="color:#a3be8c;">1.0</span><span>"
</span><span>
</span><span style="color:#b48ead;">#ifdef</span><span> HAVE_CONFIG_H
</span><span style="color:#b48ead;">#include </span><span>"</span><span style="color:#a3be8c;">config.h</span><span>"
</span><span style="color:#b48ead;">#endif
</span><span>
</span><span style="color:#b48ead;">#include </span><span>"</span><span style="color:#a3be8c;">php.h</span><span>"
</span><span>
</span><span style="color:#b48ead;">extern</span><span> zend_module_entry score_module_entry;
</span><span style="color:#b48ead;">#define </span><span>phpext_score_ptr &score_module_entry
</span><span>
</span><span style="color:#65737e;">// Define our Rust foreign function interface (ffi) here
</span><span style="color:#b48ead;">extern double </span><span style="color:#8fa1b3;">ext_score</span><span>(</span><span style="color:#b48ead;">unsigned char </span><span>*, </span><span style="color:#b48ead;">unsigned </span><span style="color:#bf616a;">int</span><span>, </span><span style="color:#b48ead;">unsigned char </span><span>*, </span><span style="color:#b48ead;">unsigned </span><span style="color:#bf616a;">int</span><span>);
</span><span>
</span><span style="color:#b48ead;">#endif
</span></code></pre>
<p>You can copy/paste most of this and replace <code>SCORE</code> and <code>score</code> with the name of your extension. I chose to define the score library's functions here. We are telling the compiler that something external to our code is defining a function named <code>ext_score</code>. This allows our code to compile successfully when we go to use this Rust function. Make sure you list all the functions your Rust library is exposing.</p>
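<p>For orientation, the Rust side of such a declaration might look roughly like the sketch below. This is not the code from the linked repository. The parameter types are chosen to mirror a pointer plus an unsigned length for each string, and <code>toy_score</code> is a hypothetical placeholder for the real scoring logic.</p>

```rust
use std::os::raw::{c_double, c_uchar, c_uint};
use std::slice;

// Hypothetical stand-in for the real scoring logic: the fraction of query
// bytes that appear anywhere in the choice.
fn toy_score(choice: &[u8], query: &[u8]) -> f64 {
    if query.is_empty() {
        return 1.0;
    }
    let hits = query.iter().filter(|b| choice.contains(*b)).count();
    hits as f64 / query.len() as f64
}

// Exported with the C ABI and an unmangled name so that C code can link
// against the symbol `ext_score`. Each string arrives as a pointer plus a
// length, so no NUL terminator is needed.
#[no_mangle]
pub extern "C" fn ext_score(
    choice: *const c_uchar,
    choice_len: c_uint,
    query: *const c_uchar,
    query_len: c_uint,
) -> c_double {
    if choice.is_null() || query.is_null() {
        return 0.0;
    }
    // Safety: the caller must pass pointers that are valid for the lengths given.
    let choice = unsafe { slice::from_raw_parts(choice, choice_len as usize) };
    let query = unsafe { slice::from_raw_parts(query, query_len as usize) };
    toy_score(choice, query)
}
```

<p>The important parts are <code>#[no_mangle]</code> and <code>extern "C"</code>, which together give the linker a plain <code>ext_score</code> symbol to find. The null check and the placeholder scoring are only there to keep the sketch self-contained.</p>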
<h2 id="extension-source-code"><a class="zola-anchor" href="#extension-source-code" aria-label="Anchor link for: extension-source-code">Extension Source Code</a></h2>
<p>The <code>score.c</code> file is a little long and most of it is uninteresting. The full <a href="https://github.com/hjr3/rust-php-ext/blob/master/php-ext/score.c">score.c</a> file is here. Let us explore the relevant portion where we create a PHP userland function named <code>score</code> to call our Rust <code>ext_score</code> function.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>PHP_FUNCTION(score)
</span><span>{
</span><span> char *choice;
</span><span> int choice_len;
</span><span> char *query;
</span><span> int query_len;
</span><span>
</span><span> if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ss", &choice, &choice_len, &query, &query_len) == FAILURE) {
</span><span> return;
</span><span> }
</span><span>
</span><span> double s = ext_score(choice, choice_len, query, query_len);
</span><span>
</span><span> RETURN_DOUBLE(s);
</span><span>}
</span></code></pre>
<p>We declare new PHP functions using the <code>PHP_FUNCTION</code> macro and pass it the name of the function. If you are using gdb and you want to break on this function, the macro transforms it into <code>zif_[func-name]</code>. In our case: <code>zif_score</code>. The <code>zif</code> stands for <em>Zend Internal Function</em>. You will notice the word <em>Zend</em> being used a lot as that is the name of the PHP virtual machine (and the name of the company whose founders built the vm).</p>
<p>We are using the <code>zend_parse_parameters</code> function to parse the parameters being specified in our userland function. In this case, we are expecting two strings. If this function looks a little gnarly, well that is because it is. I will provide some links at the end that explain how this function works in more detail. Suffice to say, we get back two non-nul terminated <code>char *</code> values and their corresponding lengths as <code>int</code>s.</p>
<p>We can pass the strings into our <code>ext_score</code> function, get a result back and then return that value to userland PHP. We now have a working end-to-end PHP extension to a Rust library.</p>
<h2 id="further-reading"><a class="zola-anchor" href="#further-reading" aria-label="Anchor link for: further-reading">Further Reading</a></h2>
<p>For some detail on the PHP (or Zend) specific functions and macros:</p>
<ul>
<li><a href="http://devzone.zend.com/303/extension-writing-part-i-introduction-to-php-and-zend/">Extension Writing Part I: Introduction to PHP and Zend</a></li>
<li><a href="http://devzone.zend.com/317/extension-writing-part-ii-parameters-arrays-and-zvals/">Extension Writing Part II: Parameters, Arrays, and ZVALs</a></li>
</ul>
<p>If you are really serious about building PHP extensions, I suggest purchasing Sara Golemon's excellent book on <a href="http://www.amazon.com/Extending-Embedding-PHP-Sara-Golemon/dp/067232704X">Extending and Embedding PHP</a>.</p>
http://activitystrea.ms/schema/1.0/postCreating A Multi-echo Server using Rust and mio2015-07-22T00:00:00+00:002015-07-22T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/07/22/creating-a-multi-echo-server-using-rust-and-mio.html/<p>This is my second blog post in a series about async IO. You may want to read the <a href="/2015/07/12/my-basic-understanding-of-mio-and-async-io.html">first blog post</a> if you are not familiar with mio or epoll/kqueue implementations.</p>
<h2 id="basic-setup"><a class="zola-anchor" href="#basic-setup" aria-label="Anchor link for: basic-setup">Basic Setup</a></h2>
<p>At the time of this writing, I am using the newly released mio <code>0.4.x</code>. Until recently, if you got mio from crates.io, you would get <code>0.3.x</code>. There are breaking changes between these two releases.</p>
<p>I have a <a href="https://github.com/hjr3/mob/blob/multi-echo-blog-post/src/main.rs">complete working example</a> that has a lot of comments in the source code. I am going to skip over a lot of detail and try to focus on handling a read event and then writing it to all connected clients. If I am making too large of a leap, open up the source to get some more context.</p>
<p>Our example will contain two main parts:</p>
<ol>
<li>A Server that handles events from our event loop and manages all connections.</li>
<li>A Connection that represents new client connections.</li>
</ol>
<p>The code does not use <em>unwrap</em>. I want to properly handle errors to get a feel for something written in mio that is closer to production ready. An error related to a Connection should reset that connection and never tear down the entire server. An error from the server, except during init, should cause a safe shutdown.</p>
<h2 id="server"><a class="zola-anchor" href="#server" aria-label="Anchor link for: server">Server</a></h2>
<p>Here is a quick overview of the <code>Server</code> struct.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Server {
</span><span> </span><span style="color:#65737e;">// Listening socket for our server.
</span><span> </span><span style="color:#bf616a;">sock</span><span>: TcpListener,
</span><span>
</span><span> </span><span style="color:#65737e;">// We keep track of server token here instead of doing `const SERVER = Token(0)`.
</span><span> </span><span style="color:#bf616a;">token</span><span>: Token,
</span><span>
</span><span> </span><span style="color:#65737e;">// A list of connections _accepted_ by our server. This is commonly referred to as the
</span><span> </span><span style="color:#65737e;">// _connection slab_.
</span><span> </span><span style="color:#bf616a;">conns</span><span>: Slab<Connection>,
</span><span>
</span><span>}
</span></code></pre>
<p>Our <code>Server</code> object will receive all the events from the event loop by implementing <code>mio::Handler</code>. A read event for the server token means a new client connection is coming in. We need to <em>accept</em> that new request, create a new <code>Connection</code> and add that connection object to our slab. A read event for any other token means we should already have established that connection. We need to forward the read event to that established connection.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">impl </span><span>Handler </span><span style="color:#b48ead;">for </span><span>Server {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">ready</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>, </span><span style="color:#bf616a;">event_loop</span><span>: &</span><span style="color:#b48ead;">mut </span><span>EventLoop<Server>, </span><span style="color:#bf616a;">token</span><span>: Token, </span><span style="color:#bf616a;">events</span><span>: EventSet) {
</span><span> </span><span style="color:#b48ead;">if</span><span> events.</span><span style="color:#96b5b4;">is_readable</span><span>() {
</span><span> </span><span style="color:#b48ead;">if </span><span style="color:#bf616a;">self</span><span>.token == token {
</span><span> </span><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">accept</span><span>(event_loop);
</span><span> } </span><span style="color:#b48ead;">else </span><span>{
</span><span>
</span><span> </span><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">readable</span><span>(event_loop, token)
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|_| </span><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">find_connection_by_token</span><span>(token).</span><span style="color:#96b5b4;">reregister</span><span>(event_loop))
</span><span> .</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|</span><span style="color:#bf616a;">e</span><span>| {
</span><span> warn!("</span><span style="color:#a3be8c;">Read event failed for {:?}: {:?}</span><span>", token, e);
</span><span> </span><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">reset_connection</span><span>(event_loop, token);
</span><span> });
</span><span> }
</span><span> }
</span><span> }
</span><span>}
</span></code></pre>
<p>Our accept function will add a new connection to the connection slab. <a href="https://crates.io/crates/slab">Slab</a> is described as a <em>Slab allocator for Rust</em>. I just recently <a href="https://twitter.com/hermanradtke/status/622863648273215488">discovered</a> where the term <em>slab allocator</em> came from. From what I have read about <code>Slab</code>, it allows us to use custom types as the index for a vector-like data structure. Within mio, the <code>Slab</code> type has been reexported as <code>pub type Slab<T> = ::slab::Slab<T, ::Token>;</code>. This means that the <code>Token</code> type will be the index and our <code>Connection</code> will be the value. Do not get confused, like I was, between the <code>Slab</code> type in the <em>slab</em> crate and the <code>Slab</code> type mio is reexporting.</p>
<p>Also, I will be using the <code>Server#find_connection_by_token</code> method all over the place. It is really just a thin wrapper to look up a connection with a given token: <code>self.conns[token]</code>.</p>
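<p>To make the idea of a token-indexed, vector-like structure more concrete, here is a toy version written with only the standard library. It is not the API of the <em>slab</em> crate or of mio: <code>ToySlab</code>, its methods and this local <code>Token</code> type are all assumptions made for illustration, and a real slab tracks free slots more efficiently than the linear scan used here.</p>

```rust
// A local stand-in for mio's `Token`: a tuple struct around `usize`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Token(usize);

// A fixed-capacity, vector-backed store whose index type is `Token`.
struct ToySlab<T> {
    entries: Vec<Option<T>>,
}

impl<T> ToySlab<T> {
    fn new(capacity: usize) -> ToySlab<T> {
        let mut entries = Vec::with_capacity(capacity);
        for _ in 0..capacity {
            entries.push(None);
        }
        ToySlab { entries }
    }

    // Find a free slot, build the value from that slot's token, and return
    // the token. Returns `None` when every slot is taken.
    fn insert_with<F: FnOnce(Token) -> T>(&mut self, make: F) -> Option<Token> {
        let idx = self.entries.iter().position(|e| e.is_none())?;
        self.entries[idx] = Some(make(Token(idx)));
        Some(Token(idx))
    }

    // Look up a value by its token.
    fn get(&self, token: Token) -> Option<&T> {
        self.entries.get(token.0).and_then(|e| e.as_ref())
    }

    // Remove a value, freeing its slot (and its token) for reuse.
    fn remove(&mut self, token: Token) -> Option<T> {
        self.entries.get_mut(token.0).and_then(|e| e.take())
    }
}
```

<p>The point of the sketch is the shape of <code>insert_with</code>: the closure receives the token before the value exists, so the value can store its own token, and the same token comes back to the caller for registering with the event loop.</p>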
<p>Let us see the slab allocator in action:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">accept</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>, </span><span style="color:#bf616a;">event_loop</span><span>: &</span><span style="color:#b48ead;">mut </span><span>EventLoop<Server>) {
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> sock = </span><span style="color:#65737e;">// ... skip some boilerplate about accepting a new socket connection
</span><span>
</span><span> </span><span style="color:#65737e;">// `Slab#insert_with` is a wrapper around `Slab#insert`. I like `#insert_with`
</span><span> </span><span style="color:#65737e;">// because I make the `Token` required for creating a new connection.
</span><span> </span><span style="color:#65737e;">//
</span><span> </span><span style="color:#65737e;">// `Slab#insert` returns the index where the connection was inserted.
</span><span> </span><span style="color:#65737e;">// Remember that in mio, the Slab is actually defined as
</span><span> </span><span style="color:#65737e;">// `pub type Slab<T> = ::slab::Slab<T, ::Token>;`. Token is just a
</span><span> </span><span style="color:#65737e;">// tuple struct around `usize` and Token implements the `::slab::Index`
</span><span> </span><span style="color:#65737e;">// trait. So, every insert into the connection slab will return a new
</span><span> </span><span style="color:#65737e;">// token needed to register with the event loop. Fancy...
</span><span> </span><span style="color:#b48ead;">match </span><span style="color:#bf616a;">self</span><span>.conns.</span><span style="color:#96b5b4;">insert_with</span><span>(|</span><span style="color:#bf616a;">token</span><span>| {
</span><span> debug!("</span><span style="color:#a3be8c;">registering {:?} with event loop</span><span>", token);
</span><span> Connection::new(sock, token)
</span><span> }) {
</span><span> Some(token) => {
</span><span> </span><span style="color:#65737e;">// If we successfully insert, then register our connection.
</span><span> </span><span style="color:#b48ead;">match </span><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">find_connection_by_token</span><span>(token).</span><span style="color:#96b5b4;">register</span><span>(event_loop) {
</span><span> Ok(_) => {},
</span><span> Err(e) => {
</span><span> error!("</span><span style="color:#a3be8c;">Failed to register {:?} connection with event loop, {:?}</span><span>", token, e);
</span><span> </span><span style="color:#bf616a;">self</span><span>.conns.</span><span style="color:#96b5b4;">remove</span><span>(token);
</span><span> }
</span><span> }
</span><span> },
</span><span> None => {
</span><span> </span><span style="color:#65737e;">// If we fail to insert, `conn` will go out of scope and be dropped.
</span><span> error!("</span><span style="color:#a3be8c;">Failed to insert connection into slab</span><span>");
</span><span> }
</span><span> };
</span><span>
</span><span> </span><span style="color:#65737e;">// We are using edge-triggered polling. Even our SERVER token needs to reregister.
</span><span> </span><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">reregister</span><span>(event_loop);
</span><span> }
</span></code></pre>
<p>Established connections are forwarded to <code>Server#readable</code>. Connections are identified by the token provided to us from the event loop. Once a read has finished, push the receive buffer into all the existing connections so we can echo it back to every client (remember, this is a multi-echo server).</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">readable</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>, </span><span style="color:#bf616a;">event_loop</span><span>: &</span><span style="color:#b48ead;">mut </span><span>EventLoop<Server>, </span><span style="color:#bf616a;">token</span><span>: Token) -> io::Result<()> {
</span><span> debug!("</span><span style="color:#a3be8c;">server conn readable; token={:?}</span><span>", token);
</span><span> </span><span style="color:#b48ead;">let</span><span> message = try!(</span><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">find_connection_by_token</span><span>(token).</span><span style="color:#96b5b4;">readable</span><span>());
</span><span>
</span><span> </span><span style="color:#b48ead;">if</span><span> message.</span><span style="color:#96b5b4;">remaining</span><span>() == message.</span><span style="color:#96b5b4;">capacity</span><span>() { </span><span style="color:#65737e;">// is_empty
</span><span> </span><span style="color:#b48ead;">return </span><span>Ok(());
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">let mut</span><span> bad_tokens = Vec::new();
</span><span>
</span><span> </span><span style="color:#65737e;">// Queue up a write for all connected clients.
</span><span> </span><span style="color:#b48ead;">for</span><span> conn in </span><span style="color:#bf616a;">self</span><span>.conns.</span><span style="color:#96b5b4;">iter_mut</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> conn_send_buf = ByteBuf::from_slice(message.</span><span style="color:#96b5b4;">bytes</span><span>());
</span><span> conn.</span><span style="color:#96b5b4;">send_message</span><span>(conn_send_buf)
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|_| conn.</span><span style="color:#96b5b4;">reregister</span><span>(event_loop))
</span><span> .</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|</span><span style="color:#bf616a;">e</span><span>| {
</span><span> error!("</span><span style="color:#a3be8c;">Failed to queue message for {:?}: {:?}</span><span>", conn.token, e);
</span><span> </span><span style="color:#65737e;">// We have a mutable borrow for the connection, so we cannot
</span><span> </span><span style="color:#65737e;">// remove until the loop is finished
</span><span> bad_tokens.</span><span style="color:#96b5b4;">push</span><span>(conn.token)
</span><span> });
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">for</span><span> t in bad_tokens {
</span><span> </span><span style="color:#bf616a;">self</span><span>.</span><span style="color:#96b5b4;">reset_connection</span><span>(event_loop, t);
</span><span> }
</span><span>
</span><span> Ok(())
</span><span> }
</span></code></pre>
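<p>The two-pass cleanup used in <code>Server#readable</code> (record the failing tokens while iterating, remove them once the loop is done) can be sketched with only the standard library. This is a minimal sketch under stated assumptions: <code>Conn</code>, its <code>healthy</code> flag and <code>broadcast</code> are hypothetical stand-ins, and a <code>HashMap</code> takes the place of the connection slab.</p>

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a connection; not a mio type.
struct Conn {
    token: usize,
    healthy: bool,
    send_queue: Vec<Vec<u8>>,
}

impl Conn {
    // Queue a message, failing if the connection is in a bad state.
    fn send_message(&mut self, msg: &[u8]) -> Result<(), String> {
        if !self.healthy {
            return Err(format!("connection {} is broken", self.token));
        }
        self.send_queue.push(msg.to_vec());
        Ok(())
    }
}

// Queue `msg` on every connection, then remove the ones that failed.
// Returns the removed tokens in ascending order.
fn broadcast(conns: &mut HashMap<usize, Conn>, msg: &[u8]) -> Vec<usize> {
    let mut bad_tokens = Vec::new();

    // Pass 1: `values_mut` holds a mutable borrow of the map, so all we
    // can do here is record which connections failed.
    for conn in conns.values_mut() {
        if conn.send_message(msg).is_err() {
            bad_tokens.push(conn.token);
        }
    }

    // Pass 2: the borrow from the loop has ended, so removal is allowed.
    for t in &bad_tokens {
        conns.remove(t);
    }

    bad_tokens.sort();
    bad_tokens
}

fn main() {
    let mut conns = HashMap::new();
    for token in 0..3 {
        conns.insert(token, Conn { token, healthy: token != 1, send_queue: Vec::new() });
    }
    let removed = broadcast(&mut conns, b"hello");
    println!("removed {:?}, {} connections remain", removed, conns.len());
}
```

<p>Trying to call <code>conns.remove</code> inside the first loop would be rejected by the borrow checker, which is why the tokens are collected first.</p>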
<h2 id="connection"><a class="zola-anchor" href="#connection" aria-label="Anchor link for: connection">Connection</a></h2>
<p>The <code>Connection</code> object represents a client connection. This looks similar to <code>Server</code>, with a few differences. I need to keep track of what events we are interested in. By default, the connection is always interested in a read event. Only when we push messages into the <code>send_queue</code> will the connection be interested in a write event.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Connection {
</span><span> </span><span style="color:#65737e;">// handle to the accepted socket
</span><span> </span><span style="color:#bf616a;">sock</span><span>: TcpStream,
</span><span>
</span><span> </span><span style="color:#65737e;">// token used to register with the event loop
</span><span> </span><span style="color:#bf616a;">token</span><span>: Token,
</span><span>
</span><span> </span><span style="color:#65737e;">// set of events we are interested in
</span><span> </span><span style="color:#bf616a;">interest</span><span>: EventSet,
</span><span>
</span><span> </span><span style="color:#65737e;">// messages waiting to be sent out
</span><span> </span><span style="color:#bf616a;">send_queue</span><span>: Vec<ByteBuf>,
</span><span>}
</span></code></pre>
<p>We are using <code>MutByteBuf</code> to read data from the socket. MutByteBuf, part of the <a href="https://crates.io/crates/bytes">bytes crate</a>, is a heap-allocated slice that mio supports internally. I prefer to use this as it does the work of tracking how much of our slice has been used. I chose a capacity of 2048 after reading <a href="https://github.com/carllerche/mio/blob/eed4855c627892b88f7ca68d3283cbc708a1c2b3/src/io.rs#L23-27">some mio source code</a> as that seems like a good size for streaming. If you are wondering what the difference is between message-based and continuous streaming, read the answer to this <a href="http://stackoverflow.com/questions/3017633/difference-between-message-oriented-protocols-and-stream-oriented-protocols">StackOverflow question</a>. TLDR: UDP vs TCP. We are using TCP.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">readable</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>) -> io::Result<ByteBuf> {
</span><span>
</span><span> </span><span style="color:#b48ead;">let mut</span><span> recv_buf = ByteBuf::mut_with_capacity(</span><span style="color:#d08770;">2048</span><span>);
</span><span>
</span><span> </span><span style="color:#65737e;">// we are PollOpt::edge() and PollOpt::oneshot(), so we _must_ drain
</span><span> </span><span style="color:#65737e;">// the entire socket receive buffer, otherwise the server will hang.
</span><span> </span><span style="color:#b48ead;">loop </span><span>{
</span><span> </span><span style="color:#b48ead;">match </span><span style="color:#bf616a;">self</span><span>.sock.</span><span style="color:#96b5b4;">try_read_buf</span><span>(&</span><span style="color:#b48ead;">mut</span><span> recv_buf) {
</span><span> </span><span style="color:#65737e;">// the socket receive buffer is empty, so let's move on
</span><span> </span><span style="color:#65737e;">// try_read_buf internally handles WouldBlock here too
</span><span> Ok(None) => {
</span><span> debug!("</span><span style="color:#a3be8c;">CONN : we read 0 bytes</span><span>");
</span><span> </span><span style="color:#b48ead;">break</span><span>;
</span><span> },
</span><span> Ok(Some(n)) => {
</span><span> debug!("</span><span style="color:#a3be8c;">CONN : we read {} bytes</span><span>", n);
</span><span>
</span><span> </span><span style="color:#65737e;">// if we read less than capacity, then we know the
</span><span> </span><span style="color:#65737e;">// socket is empty and we should stop reading. if we
</span><span> </span><span style="color:#65737e;">// read to full capacity, we need to keep reading so we
</span><span> </span><span style="color:#65737e;">// can drain the socket. if the client sent exactly capacity,
</span><span> </span><span style="color:#65737e;">// we will match the arm above. the receive buffer will be
</span><span> </span><span style="color:#65737e;">// full, so extra bytes are being dropped on the floor. to
</span><span> </span><span style="color:#65737e;">// properly handle this, i would need to push the data into
</span><span> </span><span style="color:#65737e;">// a growable Vec<u8>.
</span><span> </span><span style="color:#b48ead;">if</span><span> n < recv_buf.</span><span style="color:#96b5b4;">capacity</span><span>() {
</span><span> </span><span style="color:#b48ead;">break</span><span>;
</span><span> }
</span><span> },
</span><span> Err(e) => {
</span><span> error!("</span><span style="color:#a3be8c;">Failed to read buffer for token {:?}, error: {}</span><span>", </span><span style="color:#bf616a;">self</span><span>.token, e);
</span><span> </span><span style="color:#b48ead;">return </span><span>Err(e);
</span><span> }
</span><span> }
</span><span> }
</span><span>
</span><span> </span><span style="color:#65737e;">// change our type from MutByteBuf to ByteBuf so we can use it to
</span><span> </span><span style="color:#65737e;">// write
</span><span> Ok(recv_buf.</span><span style="color:#96b5b4;">flip</span><span>())
</span><span> }
</span></code></pre>
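<p>The comment in <code>readable</code> notes that handling messages larger than the buffer would require a growable <code>Vec&lt;u8&gt;</code>. A minimal sketch of that idea follows, using <code>std::io::Read</code> instead of mio's <code>try_read_buf</code>. The <code>drain</code> function and the <code>FakeSocket</code> test double are hypothetical names introduced only for illustration.</p>

```rust
use std::io::{self, ErrorKind, Read};

// Drain `src` into a growable Vec until it reports WouldBlock or EOF.
fn drain<R: Read>(src: &mut R) -> io::Result<Vec<u8>> {
    let mut message = Vec::new();
    let mut chunk = [0u8; 2048];
    loop {
        match src.read(&mut chunk) {
            // A zero byte read means the peer closed the connection.
            Ok(0) => break,
            // Copy what we got and keep going; there may be more.
            Ok(n) => message.extend_from_slice(&chunk[..n]),
            // The receive buffer is empty for now, so we are done.
            Err(ref e) if e.kind() == ErrorKind::WouldBlock => break,
            Err(e) => return Err(e),
        }
    }
    Ok(message)
}

// Hypothetical test double: serves `data` in pieces of at most `step`
// bytes and reports WouldBlock once everything has been handed out.
struct FakeSocket {
    data: Vec<u8>,
    pos: usize,
    step: usize,
}

impl Read for FakeSocket {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.pos == self.data.len() {
            return Err(io::Error::new(ErrorKind::WouldBlock, "drained"));
        }
        let n = self.step.min(buf.len()).min(self.data.len() - self.pos);
        buf[..n].copy_from_slice(&self.data[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}

fn main() {
    let mut sock = FakeSocket { data: vec![7u8; 5000], pos: 0, step: 2048 };
    let message = drain(&mut sock).unwrap();
    println!("drained {} bytes", message.len());
}
```

<p>Because the loop only stops on <code>WouldBlock</code> or a zero byte read, a message larger than one chunk is kept whole rather than truncated.</p>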
<p>The result of the read is pushed into the write queue of every existing connection by <code>Server#readable</code> (we went over this function above). The last thing to do is to write this message back to the client. The <code>try_write_buf</code> method is similar to the <code>try_read_buf</code> method we used above except that it expects a <code>ByteBuf</code>. I chose to only write one buffer from the queue to the client per write event. If there are still buffers in the queue, we remain interested in writable events. If the queue is empty, then we are no longer interested in write events.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">writable</span><span>(&</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">self</span><span>) -> io::Result<()> {
</span><span>
</span><span> try!(</span><span style="color:#bf616a;">self</span><span>.send_queue.</span><span style="color:#96b5b4;">pop</span><span>()
</span><span> .</span><span style="color:#96b5b4;">ok_or</span><span>(Error::new(ErrorKind::Other, "</span><span style="color:#a3be8c;">Could not pop send queue</span><span>"))
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#b48ead;">mut </span><span style="color:#bf616a;">buf</span><span>| {
</span><span> </span><span style="color:#b48ead;">match </span><span style="color:#bf616a;">self</span><span>.sock.</span><span style="color:#96b5b4;">try_write_buf</span><span>(&</span><span style="color:#b48ead;">mut</span><span> buf) {
</span><span> Ok(None) => {
</span><span> debug!("</span><span style="color:#a3be8c;">client flushing buf; WouldBlock</span><span>");
</span><span>
</span><span> </span><span style="color:#65737e;">// put message back into the queue so we can try again
</span><span> </span><span style="color:#bf616a;">self</span><span>.send_queue.</span><span style="color:#96b5b4;">push</span><span>(buf);
</span><span> Ok(())
</span><span> },
</span><span> Ok(Some(n)) => {
</span><span> debug!("</span><span style="color:#a3be8c;">CONN : we wrote {} bytes</span><span>", n);
</span><span> Ok(())
</span><span> },
</span><span> Err(e) => {
</span><span> error!("</span><span style="color:#a3be8c;">Failed to send buffer for {:?}, error: {}</span><span>", </span><span style="color:#bf616a;">self</span><span>.token, e);
</span><span> Err(e)
</span><span> }
</span><span> }
</span><span> })
</span><span> );
</span><span>
</span><span> </span><span style="color:#b48ead;">if </span><span style="color:#bf616a;">self</span><span>.send_queue.</span><span style="color:#96b5b4;">is_empty</span><span>() {
</span><span> </span><span style="color:#bf616a;">self</span><span>.interest.</span><span style="color:#96b5b4;">remove</span><span>(EventSet::writable());
</span><span> }
</span><span>
</span><span> Ok(())
</span><span> }
</span></code></pre>
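<p>The "one buffer per write event" rule and the interest toggling can also be sketched without mio. Note the assumptions: <code>wants_write</code> is a hypothetical boolean standing in for the writable bit of <code>EventSet</code>, the socket is any <code>std::io::Write</code>, and the queue is a <code>VecDeque</code> so that buffers leave in the order they were queued. That last point is a substitution on my part; <code>Vec::pop</code>, as used above, removes the most recently pushed buffer.</p>

```rust
use std::collections::VecDeque;
use std::io::{self, ErrorKind, Write};

// Hypothetical connection state, generic over anything writable.
struct Conn<W: Write> {
    sock: W,
    send_queue: VecDeque<Vec<u8>>,
    wants_write: bool,
}

impl<W: Write> Conn<W> {
    // Queue a message and start caring about writable events.
    fn send_message(&mut self, msg: Vec<u8>) {
        self.send_queue.push_back(msg);
        self.wants_write = true;
    }

    // Write at most one queued buffer per writable event.
    fn writable(&mut self) -> io::Result<()> {
        if let Some(buf) = self.send_queue.pop_front() {
            match self.sock.write(&buf) {
                // Partial write: keep the unwritten tail at the front.
                Ok(n) if n < buf.len() => {
                    self.send_queue.push_front(buf[n..].to_vec());
                }
                Ok(_) => {}
                // Not ready: put the whole buffer back and try again later.
                Err(ref e) if e.kind() == ErrorKind::WouldBlock => {
                    self.send_queue.push_front(buf);
                }
                Err(e) => return Err(e),
            }
        }
        // Stay interested in writes only while something is queued.
        self.wants_write = !self.send_queue.is_empty();
        Ok(())
    }
}

fn main() {
    // A Vec<u8> implements Write, so it can play the role of the socket.
    let mut conn = Conn {
        sock: Vec::<u8>::new(),
        send_queue: VecDeque::new(),
        wants_write: false,
    };
    conn.send_message(b"one".to_vec());
    conn.send_message(b"two".to_vec());
    conn.writable().unwrap();
    println!("wrote {:?}, still interested in writes: {}", conn.sock, conn.wants_write);
}
```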
<p>I am just getting into async io and mio, so my implementation may not be ideal, but it works. We have a functioning multi-echo server that is resistant to errors. The source also contains a simple client that will repeatedly write a message to the server and then read a message.</p>
<p>One thing that this code does not do well is handle reads from a client. In order to do that well, we need to establish a simple <em>protocol</em>. I am working through that now and will go over that in my <a href="/2015/09/12/creating-a-simple-protocol-when-using-rust-and-mio.html">next post</a>.</p>
http://activitystrea.ms/schema/1.0/postThe _with Function Pattern in Rust2015-07-14T00:00:00+00:002015-07-14T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/07/14/the_with_function_pattern_in_rust.html/<p>I really like _with style functions that accept a <code>FnOnce</code> callback. The scoping rules work out really well when using these functions. I was working with the <a href="https://github.com/carllerche/slab">slab</a> crate recently and used the <a href="https://github.com/carllerche/slab/blob/master/src/lib.rs#L142">Slab#insert_with</a> function. This function takes a callback where an object is supposed to be allocated before being inserted into the slab. The function returns an <code>Option</code> type. I was trying to figure out how to <em>drop</em> the newly created object if the function returned <code>None</code> (meaning the insertion failed). After a few minutes it dawned on me that the object was already dropped!</p>
<p>Example:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">extern crate</span><span> slab;
</span><span>
</span><span>#[</span><span style="color:#bf616a;">derive</span><span>(Debug)]
</span><span style="color:#b48ead;">struct </span><span>MyType {
</span><span> </span><span style="color:#bf616a;">index</span><span>: </span><span style="color:#b48ead;">usize</span><span>,
</span><span> </span><span style="color:#bf616a;">value</span><span>: String
</span><span>}
</span><span>
</span><span style="color:#b48ead;">type </span><span>Slab = ::slab::Slab<MyType, </span><span style="color:#b48ead;">usize</span><span>>;
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span>
</span><span> </span><span style="color:#b48ead;">let mut</span><span> slab: Slab = Slab::new(</span><span style="color:#d08770;">128</span><span>);
</span><span>
</span><span> </span><span style="color:#b48ead;">let </span><span style="color:#8fa1b3;">f </span><span>= |</span><span style="color:#bf616a;">index</span><span>: </span><span style="color:#b48ead;">usize</span><span>| -> MyType {
</span><span> MyType {
</span><span> index: index,
</span><span> value: "</span><span style="color:#a3be8c;">a very very very long string</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>()
</span><span> }
</span><span> };
</span><span>
</span><span> </span><span style="color:#b48ead;">match</span><span> slab.</span><span style="color:#96b5b4;">insert_with</span><span>(f) {
</span><span> Some(index) => {
</span><span> println!("</span><span style="color:#a3be8c;">Inserted MyType at index </span><span style="color:#d08770;">{}</span><span>", index);
</span><span> },
</span><span> None => {
</span><span> </span><span style="color:#65737e;">// If insertion fails, `MyType` will go out of scope and be dropped/freed.
</span><span> println!("</span><span style="color:#a3be8c;">Failed to insert into slab</span><span>");
</span><span> }
</span><span> }
</span><span>}
</span></code></pre>
<p>The newly allocated <code>MyType</code> is <em>moved</em> from the callback into the <code>Slab#insert_with</code> scope. If the insert fails, then <code>Slab#insert_with</code> returns <code>None</code>. The newly allocated type is left within the <code>Slab::insert_with</code> function scope. Once <code>Slab#insert_with</code> returns, the newly allocated type will be automatically dropped. When an object is dropped, the destructor is called and any allocated memory will be freed.</p>
<p>[edit: An explanation of drop semantics can be found <a href="https://github.com/rust-lang/rfcs/blob/master/text/0320-nonzeroing-dynamic-drop.md#appendices">here</a>.]</p>
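<p>To watch the drop happen, here is a minimal sketch with a value that counts its own drops. <code>TinySlab</code> and <code>Tracked</code> are hypothetical types written for this illustration. <code>TinySlab</code> is not the slab crate and makes no claim about how the crate is implemented; it simply runs the callback first and checks capacity second, so the failure path always owns a value.</p>

```rust
use std::cell::Cell;
use std::rc::Rc;

// A value that records each time it is dropped.
struct Tracked {
    drops: Rc<Cell<u32>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Hypothetical fixed-capacity container, NOT the slab crate.
struct TinySlab<T> {
    items: Vec<T>,
    capacity: usize,
}

impl<T> TinySlab<T> {
    fn new(capacity: usize) -> TinySlab<T> {
        TinySlab { items: Vec::new(), capacity }
    }

    // Build the value with the callback, then try to store it.
    fn insert_with<F: FnOnce(usize) -> T>(&mut self, f: F) -> Option<usize> {
        let index = self.items.len();
        let value = f(index); // `value` is now owned by this function
        if index == self.capacity {
            return None; // `value` goes out of scope here and is dropped
        }
        self.items.push(value);
        Some(index)
    }
}

fn main() {
    let drops = Rc::new(Cell::new(0));
    let mut slab = TinySlab::new(1);

    let first = slab.insert_with(|_| Tracked { drops: drops.clone() });
    println!("first insert: {:?}, drops so far: {}", first, drops.get());

    let second = slab.insert_with(|_| Tracked { drops: drops.clone() });
    println!("second insert: {:?}, drops so far: {}", second, drops.get());
}
```

<p>The caller never sees the rejected value, so there is nothing for the caller to clean up. The destructor runs when <code>insert_with</code> returns.</p>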
<p>The slab crate is an elegant little library that allocates a chunk of memory on the heap and stores values using a custom type for the index. It incorporates a lot of the core Rust concepts. I am learning a lot by studying the code.</p>
http://activitystrea.ms/schema/1.0/postMy Basic Understanding of mio and Asynchronous IO2015-07-12T00:00:00+00:002015-07-12T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/07/12/my-basic-understanding-of-mio-and-async-io.html/<p>I needed async IO for a Rust project I was working on. My server needs to read some bytes from a client and then send those bytes back to all registered clients. I decided to use <a href="https://github.com/carllerche/mio">mio</a>, the Rust async IO library, to build a server. All the examples I found showcased reading from a socket and then writing back to that same socket. Also, many of the examples had caveats about unhandled edge cases and used a lot of <code>unwrap</code>. Over the next three posts, I am going to walk through everything I learned about async (or evented) IO as it relates to mio, how I set up my server and talk in depth about some of the ways I thought about handling errors. Let us start with an overview of how async IO works within the context of mio.</p>
<h2 id="how-mio-exposes-async-io"><a class="zola-anchor" href="#how-mio-exposes-async-io" aria-label="Anchor link for: how-mio-exposes-async-io">How mio Exposes Async IO</a></h2>
<p>I initially thought the mio library was a Rust wrapper around <a href="http://software.schmorp.de/pkg/libev.html">libev</a>. To my surprise, I realized mio is a replacement for libev. The name mio (metal IO) starts to make a lot more sense. The mio library is interfacing directly with <a href="http://man7.org/linux/man-pages/man7/epoll.7.html">epoll</a> if you are on Linux and <a href="https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/kqueue.2.html">kqueue</a> if you are on FreeBSD (or OS X). The mio event loop interface expects us to know how the epoll/kqueue implementations work. Mio gives us complete control over how the async IO works, but it also means there is quite a bit to learn. After a decent amount of reading, I believe I have a basic understanding of what some of the affordances mio exposes actually do.</p>
<h3 id="registration"><a class="zola-anchor" href="#registration" aria-label="Anchor link for: registration">Registration</a></h3>
<p>The <code>EventLoop</code> object provided by mio is our main point of contact. Interaction with the event loop is in the form of the <code>register</code>, <code>register_opt</code>, <code>reregister</code> and <code>deregister</code> functions. These functions allow our code to control how the event loop interacts with the incoming client connections. All the functions, with the exception of deregister, require four arguments: a <code>TcpStream</code> socket for our client connection, a <code>Token</code> to identify connections, an <code>EventSet</code> to control what events we are notified of and a <code>PollOpt</code> to determine how we should be notified. The deregister function only needs the socket.</p>
<p>Understanding how these functions work is key to understanding how to use mio. The concept of registering with an event loop is fairly straightforward, especially once you start looking at the code. It is the arguments to these registration functions that are more dense. I am going to assume you understand the basics of how a socket works and how to read and write bytes using that socket. The other arguments require some more in-depth discussion.</p>
<h3 id="tokens"><a class="zola-anchor" href="#tokens" aria-label="Anchor link for: tokens">Tokens</a></h3>
<p>I found the use of tokens strange at first. I am more familiar with the use of callbacks to deal with asynchronous events. Mio uses tokens as an alternative to callbacks in order to achieve the design goal of zero allocations at runtime. The <code>Token</code> type is really just a <a href="https://doc.rust-lang.org/nightly/book/structs.html#tuple-structs">tuple struct</a> wrapper around <code>usize</code>. This means it is cheap to compare and copy. This will be important later when we start using the <code>Token</code> type.</p>
<p>A <code>Token</code> is used to identify the state related to a connected socket. We register with the event loop using a token. Later on, the event loop will specify this token when notifying us of an event. A feedback loop of sorts is created. The <code>Token</code> is stored, along with the connection state, in the <em>connection slab</em>. I am going to discuss the connection slab when we start looking at the code. Trying to explain it without code feels overly complicated.</p>
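<p>Before getting to the real code, the feedback loop can be sketched in a few lines. This is only an illustration under stated assumptions: <code>Token</code> is re-created here as a plain tuple struct (mio defines its own), and <code>Conns</code>, <code>ConnState</code> and <code>on_readable</code> are hypothetical names. The point is that the event handler receives nothing but a token and uses it as an index to find the connection state.</p>

```rust
// Hypothetical re-creation of the idea; mio defines its own `Token`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Token(usize);

// State we keep for each connected socket.
struct ConnState {
    token: Token,
    bytes_read: usize,
}

// A very small "connection slab": the token's inner usize is the index.
struct Conns {
    slots: Vec<Option<ConnState>>,
}

impl Conns {
    // Store new state and hand back the token that identifies it.
    fn insert(&mut self) -> Token {
        let token = Token(self.slots.len());
        self.slots.push(Some(ConnState { token, bytes_read: 0 }));
        token
    }

    // Look up state by token; copying a Token is as cheap as copying a usize.
    fn get_mut(&mut self, token: Token) -> Option<&mut ConnState> {
        self.slots.get_mut(token.0).and_then(|slot| slot.as_mut())
    }
}

// What an event handler might do: it only receives the token.
fn on_readable(conns: &mut Conns, token: Token, n: usize) -> bool {
    match conns.get_mut(token) {
        Some(conn) => {
            conn.bytes_read += n;
            true
        }
        None => false,
    }
}

fn main() {
    let mut conns = Conns { slots: Vec::new() };
    let token = conns.insert();
    on_readable(&mut conns, token, 42);
    let conn = conns.get_mut(token).unwrap();
    println!("{:?} has read {} bytes", conn.token, conn.bytes_read);
}
```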
<h3 id="eventset-formerly-known-as-interest"><a class="zola-anchor" href="#eventset-formerly-known-as-interest" aria-label="Anchor link for: eventset-formerly-known-as-interest">EventSet (Formerly Known As Interest)</a></h3>
<p>The <code>EventSet</code> object represents the set of events we are interested in being notified of. Until recently, the <code>EventSet</code> type was named <code>Interest</code>. The <code>0.3.x</code> branch still refers to it as <code>Interest</code>. There are four types of events:</p>
<ul>
<li>readable - Tells the event loop we want to read data from a client connection.</li>
<li>writable - Tells the event loop we want to write data to a client connection.</li>
<li>hup - Tells the event loop that we want to be notified when a client closes the connection (hangs up).</li>
<li>error - Tells the event loop that we want to listen for errors.</li>
</ul>
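<p>One way to picture a set of events is as bit flags that can be combined, queried and removed. The sketch below is an assumption-laden illustration: <code>Events</code> and its constants are hypothetical and are not mio's <code>EventSet</code>, whose internal representation I am not describing here.</p>

```rust
use std::ops::BitOr;

// Hypothetical bit-flag set modeled on the four events listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Events(u8);

impl Events {
    const READABLE: Events = Events(0b0001);
    const WRITABLE: Events = Events(0b0010);
    const HUP: Events = Events(0b0100);
    const ERROR: Events = Events(0b1000);

    // True if every flag in `other` is also set in `self`.
    fn contains(self, other: Events) -> bool {
        self.0 & other.0 == other.0
    }

    // Add the flags in `other` to the set.
    fn insert(&mut self, other: Events) {
        self.0 |= other.0;
    }

    // Clear the flags in `other` from the set.
    fn remove(&mut self, other: Events) {
        self.0 &= !other.0;
    }
}

impl BitOr for Events {
    type Output = Events;

    fn bitor(self, rhs: Events) -> Events {
        Events(self.0 | rhs.0)
    }
}

fn main() {
    // A connection that always wants reads, hang-ups and errors.
    let mut interest = Events::READABLE | Events::HUP | Events::ERROR;

    // Something was queued for sending, so start caring about writes.
    interest.insert(Events::WRITABLE);
    println!("wants write: {}", interest.contains(Events::WRITABLE));

    // The queue emptied again.
    interest.remove(Events::WRITABLE);
    println!("wants write: {}", interest.contains(Events::WRITABLE));
}
```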
<p>If you are curious as to why <code>Interest</code> was renamed to <code>EventSet</code>, you can read the <a href="https://github.com/carllerche/mio/issues/184">full discussion</a> that ultimately resulted in the change. Essentially, epoll and kqueue have slightly different interfaces and this change made it easier for people using the mio library to handle those differences. The <code>EventSet</code> type removed the notion of a <em>read hint</em> that is present in many mio examples currently out there.</p>
<h4 id="write-notifications"><a class="zola-anchor" href="#write-notifications" aria-label="Anchor link for: write-notifications">Write Notifications</a></h4>
<p>One thing that confused me a lot was <em>when</em> to use writable. A lot of the examples are just echoing back what the client sent them or they are performing some very simple task. In these cases, you do not need to register <code>EventSet::writable()</code> if you want to immediately write back to the socket you just read from. You can just perform the write as part of the current <em>readable</em> event. If you are performing an expensive task between the read and write, you may want to handle this differently. Whether you are reading or writing from the socket, you also need to be aware that the kernel might not be ready for your read or write.</p>
<h4 id="i-would-block-you"><a class="zola-anchor" href="#i-would-block-you" aria-label="Anchor link for: i-would-block-you">I Would Block You</a></h4>
<p>[edit: I updated this section to replace the word <em>block</em> with more correct language like <em>reject</em> and <em>not ready</em>.]</p>
<p>Even though we are using asynchronous IO, the kernel is not always ready for our reads and writes. When the kernel's internal send or receive buffers are full and it needs to flush them we will be asked to try again. Typically, the kernel communicates <em>try again</em> to us in the form of an error. In Rust, we have <code>std::io::ErrorKind::WouldBlock</code>. In C, it is referred to as <code>EAGAIN</code>. This error is the kernel's way of letting us know it is not ready for our read or write and that we need to try again. This <code>WouldBlock</code> error <em>must</em> be handled. In mio, we are provided with the traits, <code>TryRead</code> and <code>TryWrite</code>, which catch <code>WouldBlock</code> and treat it as a 0 byte read. These traits are convenient to use as our error handling can now assume any <code>Err(_)</code> is an unexpected error. More about this when we get to some code samples.</p>
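<p>The idea of folding <code>WouldBlock</code> into the success path can be sketched with the standard library alone. The <code>try_read</code> helper below is hypothetical and only written in the spirit of <code>TryRead</code>; it is not mio's implementation. It returns <code>Ok(None)</code> for "not ready, try again later", so that any <code>Err(_)</code> left over is a real failure. <code>NotReadyAfter</code> is a test double invented for this example.</p>

```rust
use std::io::{self, ErrorKind, Read};

// Hypothetical helper: `Ok(None)` means "not ready, try again later".
fn try_read<R: Read>(src: &mut R, buf: &mut [u8]) -> io::Result<Option<usize>> {
    match src.read(buf) {
        Ok(n) => Ok(Some(n)),
        Err(ref e) if e.kind() == ErrorKind::WouldBlock => Ok(None),
        Err(e) => Err(e),
    }
}

// Test double: hands out one chunk per read, then reports WouldBlock.
// Bytes that do not fit in `buf` are discarded to keep the double short.
struct NotReadyAfter {
    chunks: Vec<Vec<u8>>,
}

impl Read for NotReadyAfter {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if self.chunks.is_empty() {
            return Err(io::Error::new(ErrorKind::WouldBlock, "not ready"));
        }
        let chunk = self.chunks.remove(0);
        let n = chunk.len().min(buf.len());
        buf[..n].copy_from_slice(&chunk[..n]);
        Ok(n)
    }
}

fn main() {
    let mut src = NotReadyAfter { chunks: vec![b"abc".to_vec()] };
    let mut buf = [0u8; 8];

    // The first call finds data, the second finds the source not ready.
    println!("{:?}", try_read(&mut src, &mut buf));
    println!("{:?}", try_read(&mut src, &mut buf));
}
```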
<p>You might be wondering what it means to <em>try again</em> when the kernel is not ready for our read or write. In order to understand this, we need to understand the polling options.</p>
<h3 id="poll-options"><a class="zola-anchor" href="#poll-options" aria-label="Anchor link for: poll-options">Poll Options</a></h3>
<p>The <code>PollOpt</code> options exposed by mio really tripped me up at first because I did not understand how epoll/kqueue worked at all. There are basically two different polling options, or <em>triggers</em>, we can use. By default, mio will specify <code>PollOpt::level()</code> when registering with the event loop. Level-triggered polling is what you would expect from a straightforward polling implementation. If you are familiar with <code>select()</code> in C, this is basically the same thing. The downside to level-triggered polling is that we are expected to handle the events immediately. If we do not handle them immediately, then the event loop will notify us constantly of the event and we end up wasting resources.</p>
<p>What most people opt for is edge-triggered, <code>PollOpt::edge()</code>, polling. Edge-triggered polling means that when we receive a read or write event, the event loop will automatically deregister our connection. This means we can get notified of an event and then have the option of handling it now or later. If more events come in for that connection, the event loop will queue those up for us until we register again. This requires us to have to manage the state of our connections, but gives us the flexibility we really want.</p>
<p>We can also combine edge-triggered polling with another option: <code>PollOpt::oneshot()</code>. Not only does this option sound super cool, it also guarantees that only one thread will be woken up. This allows us to be thread-safe when reading or writing. Thread safety allows us to write multi-threaded epoll processes on top of mio. For my server, I decided to register connections with the event loop using <code>PollOpt::edge() | PollOpt::oneshot()</code>.</p>
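<p>The difference between the two triggers can be shown with a toy model. This is a deliberately simplified sketch: <code>FakeSource</code> and <code>Trigger</code> are hypothetical and do not talk to epoll, kqueue or mio. The model assumes that a level-triggered source notifies on every poll while unread bytes remain, and that an edge-triggered source notifies only when new data has arrived since the last notification.</p>

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Trigger {
    Level,
    Edge,
}

// Toy readiness source (hypothetical; not epoll, kqueue or mio).
struct FakeSource {
    trigger: Trigger,
    pending: usize, // unread bytes
    fresh: bool,    // has data arrived since the last notification?
}

impl FakeSource {
    fn new(trigger: Trigger) -> FakeSource {
        FakeSource { trigger, pending: 0, fresh: false }
    }

    // Simulate the kernel receiving `n` more bytes.
    fn data_arrives(&mut self, n: usize) {
        self.pending += n;
        self.fresh = true;
    }

    // One turn of the event loop: are we told the source is readable?
    fn poll(&mut self) -> bool {
        let notify = match self.trigger {
            Trigger::Level => self.pending > 0,
            Trigger::Edge => self.fresh,
        };
        self.fresh = false;
        notify
    }

    // Consume up to `n` bytes and report how many were taken.
    fn read(&mut self, n: usize) -> usize {
        let taken = n.min(self.pending);
        self.pending -= taken;
        taken
    }
}

fn main() {
    for trigger in [Trigger::Level, Trigger::Edge] {
        let mut src = FakeSource::new(trigger);
        src.data_arrives(10);
        let first = src.poll();
        src.read(4); // a partial read leaves 6 bytes behind
        let second = src.poll();
        println!("{:?}: first={}, second={}", trigger, first, second);
    }
}
```

<p>In this model a partial read under the edge trigger leaves bytes behind with no further notification until more data arrives, which is why the read loop has to drain the socket.</p>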
<h4 id="trying-again"><a class="zola-anchor" href="#trying-again" aria-label="Anchor link for: trying-again">Trying Again</a></h4>
<p>Now that we are familiar with what events we can be notified of and what our polling options are, we need to revisit the notion of trying our read or write again when the kernel is not ready for us. Using edge-triggering, a read or write event means our connection will be deregistered from the event loop. To try again, we need to first save our work and then reregister our connection with the event loop, using our token, so we can be notified after the kernel is done flushing.</p>
<h2 id="next-steps"><a class="zola-anchor" href="#next-steps" aria-label="Anchor link for: next-steps">Next Steps</a></h2>
<p>We now have the necessary context to start using mio. It took days for these concepts to really sink in with me. If you grok this already, you are awesome. If not, give it time! I am going to apply these above concepts to actual code in my <a href="/2015/07/22/creating-a-multi-echo-server-using-rust-and-mio.html">next post</a>. If you want to get started before my next post, I would start with the <a href="https://github.com/carllerche/mio/blob/master/test/test_echo_server.rs">test echo server</a> that is part of the mio test suite. There is also the <a href="https://github.com/carllerche/mio/blob/docs/doc/getting-started.md">getting started</a> documentation that mio provides, though it is somewhat out of date for the <code>0.4.x</code> branch.</p>
<p>There are also a few projects that are abstracting a lot of the details needed to get mio working. These can be great examples to learn from. The two I have looked at are:</p>
<ul>
<li><a href="https://github.com/rrichardson/reactor">Reactor</a> - Evented polling + network utilities to make life easier </li>
<li><a href="https://github.com/dpc/mioco">mioco</a> - Allows handling mio connections inside coroutines </li>
</ul>
<h2 id="sources"><a class="zola-anchor" href="#sources" aria-label="Anchor link for: sources">Sources</a></h2>
<p>In addition to reading the mio source code and example code, I did a lot of reading about epoll itself. Here is a list of some sources I used to get more familiar with epoll/kqueue:</p>
<ul>
<li><a href="http://stackoverflow.com/a/13568962/775246">Overview of epoll options</a> - StackOverflow post on the differences between level-triggered and edge-triggered polling</li>
<li><a href="http://stackoverflow.com/a/9162805/775246">Purpose of edge-triggered polling</a> - StackOverflow post describing the real advantage of edge-triggered polling</li>
<li><a href="https://banu.com/blog/2/how-to-use-epoll-a-complete-example-in-c/">Complete epoll example in C</a> - A blog post walking through an epoll implementation in C. This is great if you are familiar with C and not quite as comfortable with Rust.</li>
<li><a href="https://raw.githubusercontent.com/dankamongmen/libtorque/master/doc/mteventqueues">Event Queues and Threads</a> - Detailed document primarily describing Linux's epoll(7) I/O event notification facility as of the 2.6 kernel series.</li>
</ul>
http://activitystrea.ms/schema/1.0/postEffectively Using Iterators In Rust2015-06-22T00:00:00+00:002015-06-22T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/06/22/effectively-using-iterators-in-rust.html/<p>In Rust, you quickly learn that vector and slice types are not iterable themselves. Depending on which tutorial or example you see first, you call <code>.iter()</code> or <code>.into_iter()</code>. If you do not realize both of these functions exist or that they do different things, you may find yourself fighting with the compiler to get your code to work. Let us take a journey through the world of iterators and figure out the differences between iter() and into_iter() in Rust.</p>
<h2 id="iter"><a class="zola-anchor" href="#iter" aria-label="Anchor link for: iter">Iter</a></h2>
<p>Most examples I have found use <code>.iter()</code>. We can call <code>v.iter()</code> on something like a vector or slice. This creates an <code>Iter<'a, T></code> type and it is this <code>Iter<'a, T></code> type that implements the <code>Iterator</code> trait and allows us to call functions like <code>.map()</code>. It is important to note that this <code>Iter<'a, T></code> type only has a reference to <code>T</code>. This means that calling <code>v.iter()</code> will create a struct that <em>borrows</em> from <code>v</code>. Use the <code>iter()</code> function if you want to iterate over the values by <em>reference</em>.</p>
<p>Let us write a simple map/reduce example:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">use_names_for_something_else</span><span>(</span><span style="color:#bf616a;">_names</span><span>: Vec<&</span><span style="color:#b48ead;">str</span><span>>) {
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> names = vec!["</span><span style="color:#a3be8c;">Jane</span><span>", "</span><span style="color:#a3be8c;">Jill</span><span>", "</span><span style="color:#a3be8c;">Jack</span><span>", "</span><span style="color:#a3be8c;">John</span><span>"];
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> total_bytes = names
</span><span> .</span><span style="color:#96b5b4;">iter</span><span>()
</span><span> .</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">name</span><span>: &&</span><span style="color:#b48ead;">str</span><span>| name.</span><span style="color:#96b5b4;">len</span><span>())
</span><span> .</span><span style="color:#96b5b4;">fold</span><span>(</span><span style="color:#d08770;">0</span><span>, |</span><span style="color:#bf616a;">acc</span><span>, </span><span style="color:#bf616a;">len</span><span>| acc + len );
</span><span>
</span><span> assert_eq!(total_bytes, </span><span style="color:#d08770;">16</span><span>);
</span><span> </span><span style="color:#96b5b4;">use_names_for_something_else</span><span>(names);
</span><span>}
</span></code></pre>
<p>In this example, we are using <code>.map()</code> and <code>.fold()</code> to count the number of bytes (not characters! Rust strings are UTF-8) for all strings in the <code>names</code> vector. We <a href="/2015/06/09/strategies-for-solving-cannot-move-out-of-borrowing-errors-in-rust.html">know</a> that the <code>len()</code> function can use an immutable reference. As such, we prefer <code>iter()</code> instead of <code>iter_mut()</code> or <code>into_iter()</code>. This allows us to <em>move</em> the <code>names</code> vector later if we want. I put a bogus <code>use_names_for_something_else()</code> function in the example just to prove this. If we had used <code>into_iter()</code> instead, the compiler would have given us an <em>error: use of moved value: <code>names</code></em> response.</p>
<p>The closure used in <code>map()</code> does not require the <code>name</code> parameter to have a type, but I specified the type to show how it is being passed as a reference. Notice that the type of name is <code>&&str</code> and not <code>&str</code>. The string <code>"Jane"</code> is of type <code>&str</code>. The <code>iter()</code> function creates an iterator that has a <em>reference</em> to each element in the <code>names</code> vector. Thus, we have a <em>reference</em> to a <em>reference</em> of a string slice. This can get a little unwieldy and I generally do not worry about the type. However, if we are destructuring the type, we do need to specify the reference:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> player_scores = [
</span><span> ("</span><span style="color:#a3be8c;">Jack</span><span>", </span><span style="color:#d08770;">20</span><span>), ("</span><span style="color:#a3be8c;">Jane</span><span>", </span><span style="color:#d08770;">23</span><span>), ("</span><span style="color:#a3be8c;">Jill</span><span>", </span><span style="color:#d08770;">18</span><span>), ("</span><span style="color:#a3be8c;">John</span><span>", </span><span style="color:#d08770;">19</span><span>),
</span><span> ];
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> players = player_scores
</span><span> .</span><span style="color:#96b5b4;">iter</span><span>()
</span><span> .</span><span style="color:#96b5b4;">map</span><span>(|(</span><span style="color:#bf616a;">player</span><span>, </span><span style="color:#bf616a;">_score</span><span>)| {
</span><span> player
</span><span> })
</span><span> .collect::<Vec<_>>();
</span><span>
</span><span> assert_eq!(players, ["</span><span style="color:#a3be8c;">Jack</span><span>", "</span><span style="color:#a3be8c;">Jane</span><span>", "</span><span style="color:#a3be8c;">Jill</span><span>", "</span><span style="color:#a3be8c;">John</span><span>"]);
</span><span>}
</span></code></pre>
<p>In the above example, the compiler will complain that we are specifying the type <code>(_, _)</code> instead of <code>&(_, _)</code>. Changing the pattern to <code>&(player, _score)</code> will satisfy the compiler.</p>
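<p>Here is a minimal sketch of the corrected pattern. The <code>player_names</code> helper is a hypothetical name used only for illustration; the point is the leading <code>&</code> in the closure pattern, which matches the reference that <code>.iter()</code> hands to the closure:</p>

```rust
// Hypothetical helper (not from the original example): plucks the
// player names out of a slice of (name, score) tuples.
fn player_names<'a>(player_scores: &[(&'a str, u32)]) -> Vec<&'a str> {
    player_scores
        .iter()
        // `.iter()` yields `&(&str, u32)`, so the pattern starts with `&`
        // to destructure through the reference.
        .map(|&(player, _score)| player)
        .collect()
}

fn main() {
    let player_scores = [("Jack", 20), ("Jane", 23), ("Jill", 18), ("John", 19)];

    let players = player_names(&player_scores);

    assert_eq!(players, ["Jack", "Jane", "Jill", "John"]);
}
```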
<p>Rust is immutable by default and iterators make it easy to manipulate data without needing mutability. If you do find yourself wanting to mutate some data, you can use the <code>iter_mut()</code> method to get a mutable reference to the values. Example use of <code>iter_mut()</code>:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let mut</span><span> teams = [
</span><span> [ ("</span><span style="color:#a3be8c;">Jack</span><span>", </span><span style="color:#d08770;">20</span><span>), ("</span><span style="color:#a3be8c;">Jane</span><span>", </span><span style="color:#d08770;">23</span><span>), ("</span><span style="color:#a3be8c;">Jill</span><span>", </span><span style="color:#d08770;">18</span><span>), ("</span><span style="color:#a3be8c;">John</span><span>", </span><span style="color:#d08770;">19</span><span>), ],
</span><span> [ ("</span><span style="color:#a3be8c;">Bill</span><span>", </span><span style="color:#d08770;">17</span><span>), ("</span><span style="color:#a3be8c;">Brenda</span><span>", </span><span style="color:#d08770;">16</span><span>), ("</span><span style="color:#a3be8c;">Brad</span><span>", </span><span style="color:#d08770;">18</span><span>), ("</span><span style="color:#a3be8c;">Barbara</span><span>", </span><span style="color:#d08770;">17</span><span>), ]
</span><span> ];
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> teams_in_score_order = teams
</span><span> .</span><span style="color:#96b5b4;">iter_mut</span><span>()
</span><span> .</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">team</span><span>| {
</span><span> team.</span><span style="color:#96b5b4;">sort_by</span><span>(|&</span><span style="color:#bf616a;">a</span><span>, &</span><span style="color:#bf616a;">b</span><span>| a.</span><span style="color:#d08770;">1.</span><span style="color:#96b5b4;">cmp</span><span>(&b.</span><span style="color:#d08770;">1</span><span>).</span><span style="color:#96b5b4;">reverse</span><span>());
</span><span> team
</span><span> })
</span><span> .collect::<Vec<_>>();
</span><span>
</span><span> println!("</span><span style="color:#a3be8c;">Teams: </span><span style="color:#d08770;">{:?}</span><span>", teams_in_score_order);
</span><span>}
</span></code></pre>
<p>Here we are using a mutable reference to sort the list of players on each team by highest score. The <code>sort_by()</code> function performs the sorting of the Vector/slice in place. This means we need the ability to mutate <code>team</code> in order to sort. I do not use <code>.iter_mut()</code> often, but sometimes functions like <code>.sort_by()</code> provide no immutable alternative. </p>
<p>I tend to use <code>.iter()</code> most. I try to be very conscious and deliberate about when I <em>move</em> resources and default to borrowing (or referencing) first. The reference created by <code>.iter()</code> is short-lived, so we can <em>move</em> or use our original value afterwards. If you find yourself running into <em>does not live long enough</em>, <em>move</em> errors or using the <code>.clone()</code> function, this is a sign that you probably want to use <code>.into_iter()</code> instead.</p>
<h2 id="intoiter"><a class="zola-anchor" href="#intoiter" aria-label="Anchor link for: intoiter">IntoIter</a></h2>
<p>Use the <code>into_iter()</code> function when you want to <em>move</em>, instead of <em>borrow</em>, your value. The <code>.into_iter()</code> function creates an <code>IntoIter<T></code> type that now has ownership of the original value. Like <code>Iter<'a, T></code>, it is this <code>IntoIter<T></code> type that actually implements the <code>Iterator</code> trait. The word <em>into</em> is commonly used in Rust to signal that <code>T</code> is being <em>moved</em>. The docs also use the words <em>owned</em> or <em>consumed</em> interchangeably with <em>moved</em>. I normally find myself using <code>.into_iter()</code> when I have a function that is transforming some values:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">get_names</span><span>(</span><span style="color:#bf616a;">v</span><span>: Vec<(String, </span><span style="color:#b48ead;">usize</span><span>)>) -> Vec<String> {
</span><span> v.</span><span style="color:#96b5b4;">into_iter</span><span>()
</span><span> .</span><span style="color:#96b5b4;">map</span><span>(|(</span><span style="color:#bf616a;">name</span><span>, </span><span style="color:#bf616a;">_score</span><span>)| name)
</span><span> .</span><span style="color:#96b5b4;">collect</span><span>()
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> v = vec!( ("</span><span style="color:#a3be8c;">Herman</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>(), </span><span style="color:#d08770;">5</span><span>));
</span><span> </span><span style="color:#b48ead;">let</span><span> names = </span><span style="color:#96b5b4;">get_names</span><span>(v);
</span><span>
</span><span> assert_eq!(names, ["</span><span style="color:#a3be8c;">Herman</span><span>"]);
</span><span>}
</span></code></pre>
<p>The <code>get_names</code> function is plucking out the name from a list of tuples. I chose <code>.into_iter()</code> here because we are transforming the tuple into a <code>String</code> type.</p>
<p>The concept behind <code>.into_iter()</code> is similar to the <a href="https://doc.rust-lang.org/nightly/core/convert/trait.Into.html">core::convert::Into</a> trait we discussed when accepting <code>&str</code> and <code>String</code> in a function. In fact, every type that implements <a href="https://doc.rust-lang.org/stable/std/iter/trait.Iterator.html">std::iter::Iterator</a> implements <a href="https://github.com/rust-lang/rust/blob/b5b3a99f84f2b4dbf9495dccd7112c74f4357acc/src/libcore/iter.rs#L1184-1192">std::iter::IntoIterator</a> too. That means we can do something like <code>vec![1, 2, 3, 4].into_iter().into_iter().into_iter()</code>. Each subsequent call to <code>.into_iter()</code> just returns itself. This is an example of the <a href="https://en.wikipedia.org/wiki/Identity_function">identity function</a>. I mention that only because I find it interesting to identify functional concepts that I see being used in the wild.</p>
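<p>A small sketch makes the identity behaviour concrete. The helper name below is hypothetical; it only shows that chaining extra <code>.into_iter()</code> calls on something that is already an iterator changes nothing:</p>

```rust
// Hypothetical helper: calls `.into_iter()` three times before collecting.
fn collect_after_repeated_into_iter(v: Vec<i32>) -> Vec<i32> {
    v.into_iter() // Vec<i32> becomes std::vec::IntoIter<i32>
        .into_iter() // already an iterator, so it is returned unchanged
        .into_iter() // same again
        .collect()
}

fn main() {
    let values = collect_after_repeated_into_iter(vec![1, 2, 3, 4]);

    // The elements come back in the same order, untouched.
    assert_eq!(values, [1, 2, 3, 4]);
}
```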
<h3 id="how-for-loops-actually-work"><a class="zola-anchor" href="#how-for-loops-actually-work" aria-label="Anchor link for: how-for-loops-actually-work">How for Loops Actually Work</a></h3>
<p>One of the first errors a new Rustacean will run into is the <em>move</em> error after using a for loop:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> values = vec![</span><span style="color:#d08770;">1</span><span>, </span><span style="color:#d08770;">2</span><span>, </span><span style="color:#d08770;">3</span><span>, </span><span style="color:#d08770;">4</span><span>];
</span><span>
</span><span> </span><span style="color:#b48ead;">for</span><span> x in values {
</span><span> println!("</span><span style="color:#d08770;">{}</span><span>", x);
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> y = values; </span><span style="color:#65737e;">// move error
</span><span>}
</span></code></pre>
<p>The question we immediately ask ourselves is "How do I create a for loop that uses a reference?". A <a href="https://doc.rust-lang.org/stable/std/iter/index.html">for loop</a> in Rust is really just syntactic sugar around <code>.into_iter()</code>. From the manual:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#65737e;">// Rough translation of the iteration without a `for` iterator.
</span><span style="color:#b48ead;">let mut</span><span> it = values.</span><span style="color:#96b5b4;">into_iter</span><span>();
</span><span style="color:#b48ead;">loop </span><span>{
</span><span> </span><span style="color:#b48ead;">match</span><span> it.</span><span style="color:#96b5b4;">next</span><span>() {
</span><span> Some(x) => println!("</span><span style="color:#d08770;">{}</span><span>", x),
</span><span> None => </span><span style="color:#b48ead;">break</span><span>,
</span><span> }
</span><span>}
</span></code></pre>
<p>Now that we know <code>.into_iter()</code> creates a type <code>IntoIter<T></code> that <em>moves</em> <code>T</code>, this behavior makes perfect sense. If we want to use <code>values</code> after the for loop, we just need to use a reference instead:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> values = vec![</span><span style="color:#d08770;">1</span><span>, </span><span style="color:#d08770;">2</span><span>, </span><span style="color:#d08770;">3</span><span>, </span><span style="color:#d08770;">4</span><span>];
</span><span>
</span><span> </span><span style="color:#b48ead;">for</span><span> x in &values {
</span><span> println!("</span><span style="color:#d08770;">{}</span><span>", x);
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> y = values; </span><span style="color:#65737e;">// perfectly valid
</span><span>}
</span></code></pre>
<p>Instead of moving <code>values</code>, which is type <code>Vec<i32></code>, we are moving <code>&values</code>, which is type <code>&Vec<i32></code>. The for loop only <em>borrows</em> <code>&values</code> for the duration of the loop and we are able to <em>move</em> <code>values</code> as soon as the for loop is done.</p>
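<p>The same idea extends to mutable references. As a rough sketch (the <code>sum</code> and <code>double_in_place</code> helpers are hypothetical names for illustration), looping over a shared reference behaves like <code>.iter()</code> and looping over a mutable reference behaves like <code>.iter_mut()</code>. In both cases the original vector can still be moved afterwards:</p>

```rust
// Hypothetical helper: borrows the values and adds them up.
fn sum(values: &[i32]) -> i32 {
    let mut total = 0;
    // Looping over a shared reference yields `&i32`, like `values.iter()`.
    for x in values {
        total += *x;
    }
    total
}

// Hypothetical helper: doubles every value in place.
fn double_in_place(values: &mut [i32]) {
    // Looping over a mutable reference yields `&mut i32`, like `values.iter_mut()`.
    for x in values {
        *x *= 2;
    }
}

fn main() {
    let mut values = vec![1, 2, 3, 4];

    assert_eq!(sum(&values), 10);

    double_in_place(&mut values);
    assert_eq!(values, [2, 4, 6, 8]);

    // Both borrows have ended, so moving `values` is still allowed.
    let moved = values;
    assert_eq!(moved.len(), 4);
}
```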
<h2 id="core-iter-cloned"><a class="zola-anchor" href="#core-iter-cloned" aria-label="Anchor link for: core-iter-cloned">core::iter::Cloned</a></h2>
<p>There are times when you want to create a new value when iterating over your original value. You might first try something like:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> x = vec!["</span><span style="color:#a3be8c;">Jill</span><span>", "</span><span style="color:#a3be8c;">Jack</span><span>", "</span><span style="color:#a3be8c;">Jane</span><span>", "</span><span style="color:#a3be8c;">John</span><span>"];
</span><span>
</span><span> </span><span style="color:#b48ead;">let </span><span>_ = x
</span><span> .</span><span style="color:#96b5b4;">clone</span><span>()
</span><span> .</span><span style="color:#96b5b4;">into_iter</span><span>()
</span><span> .collect::<Vec<_>>();
</span><span>}
</span></code></pre>
<p>Exercise for the reader: <em>Why would <code>.iter()</code> not work in this example?</em></p>
<p>While this is valid, we want to give Rust every chance to optimize our code. What if we only wanted the first two names from that list?</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> x = vec!["</span><span style="color:#a3be8c;">Jill</span><span>", "</span><span style="color:#a3be8c;">Jack</span><span>", "</span><span style="color:#a3be8c;">Jane</span><span>", "</span><span style="color:#a3be8c;">John</span><span>"];
</span><span>
</span><span> </span><span style="color:#b48ead;">let </span><span>_ = x
</span><span> .</span><span style="color:#96b5b4;">clone</span><span>()
</span><span> .</span><span style="color:#96b5b4;">into_iter</span><span>()
</span><span> .</span><span style="color:#96b5b4;">take</span><span>(</span><span style="color:#d08770;">2</span><span>)
</span><span> .collect::<Vec<_>>();
</span><span>}
</span></code></pre>
<p>If we clone all of <code>x</code>, then we are cloning all four elements, but we only need two of them. We can do better by using <code>.map()</code> to clone the elements of the underlying iterator:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> x = vec!["</span><span style="color:#a3be8c;">Jill</span><span>", "</span><span style="color:#a3be8c;">Jack</span><span>", "</span><span style="color:#a3be8c;">Jane</span><span>", "</span><span style="color:#a3be8c;">John</span><span>"];
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> y = x
</span><span> .</span><span style="color:#96b5b4;">iter</span><span>()
</span><span> .</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">i</span><span>| i.</span><span style="color:#96b5b4;">clone</span><span>())
</span><span> .</span><span style="color:#96b5b4;">take</span><span>(</span><span style="color:#d08770;">2</span><span>)
</span><span> .collect::<Vec<_>>();
</span><span>}
</span></code></pre>
<p>Because iterators are lazy, this code only clones two out of the four elements of <code>x</code>: <code>.take(2)</code> stops pulling values after the second one, so the closure in <code>.map()</code> never runs for the rest. This pattern is used so often that Rust core now has a special function that does this for us called <a href="https://doc.rust-lang.org/stable/std/iter/trait.Iterator.html#method.cloned">cloned()</a>. This is a recent addition and will be stable in Rust 1.1. Our code now looks something like:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> x = vec!["</span><span style="color:#a3be8c;">Jill</span><span>", "</span><span style="color:#a3be8c;">Jack</span><span>", "</span><span style="color:#a3be8c;">Jane</span><span>", "</span><span style="color:#a3be8c;">John</span><span>"];
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> y = x
</span><span> .</span><span style="color:#96b5b4;">iter</span><span>()
</span><span> .</span><span style="color:#96b5b4;">cloned</span><span>()
</span><span> .</span><span style="color:#96b5b4;">take</span><span>(</span><span style="color:#d08770;">2</span><span>)
</span><span> .collect::<Vec<_>>();
</span><span>}
</span></code></pre>
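<p>To check the claim about how many elements get cloned, here is a sketch that counts calls to <code>clone()</code>. The <code>Name</code> type and the helper functions are hypothetical and exist only to make the clones observable; a plain <code>&str</code> gives us no way to count them:</p>

```rust
use std::cell::Cell;

// Hypothetical wrapper that records every clone in a shared counter.
struct Name<'a> {
    value: &'static str,
    clone_count: &'a Cell<usize>,
}

impl<'a> Clone for Name<'a> {
    fn clone(&self) -> Self {
        self.clone_count.set(self.clone_count.get() + 1);
        Name { value: self.value, clone_count: self.clone_count }
    }
}

// Builds the four names, all sharing the same counter.
fn make_names<'a>(counter: &'a Cell<usize>) -> Vec<Name<'a>> {
    ["Jill", "Jack", "Jane", "John"]
        .iter()
        .map(|&value| Name { value, clone_count: counter })
        .collect()
}

// Clones the whole vector first, then takes two elements.
fn clones_when_cloning_vec_first() -> usize {
    let counter = Cell::new(0);
    let x = make_names(&counter);
    let first_two: Vec<Name> = x.clone().into_iter().take(2).collect();
    assert_eq!(first_two.len(), 2);
    counter.get()
}

// Borrows the vector and clones elements lazily, one at a time.
fn clones_when_using_cloned() -> usize {
    let counter = Cell::new(0);
    let x = make_names(&counter);
    let first_two: Vec<Name> = x.iter().cloned().take(2).collect();
    assert_eq!(first_two[1].value, "Jack");
    counter.get()
}

fn main() {
    assert_eq!(clones_when_cloning_vec_first(), 4);
    assert_eq!(clones_when_using_cloned(), 2);
}
```

<p>The saving comes from laziness: <code>.take(2)</code> stops asking for elements after the second one, so <code>.cloned()</code> is never asked to clone the remaining two.</p>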
<h2 id="iterators-outside-of-core"><a class="zola-anchor" href="#iterators-outside-of-core" aria-label="Anchor link for: iterators-outside-of-core">Iterators Outside of Core</a></h2>
<p>There is a really great crate, called <a href="https://crates.io/crates/itertools">itertools</a>, that provides extra iterator adaptors, iterator methods and macros. If you are looking for some iterator functionality in the Rust docs and do not see it, there is a good chance it is part of itertools. I recently added an <a href="http://bluss.github.io/rust-itertools/doc/itertools/trait.Itertools.html#method.sort_by">itertools::Itertools::sort_by()</a> function so we can sort collections without needing to use a mutable iterator. One of the nice things about working with Rust is that the documentation looks the same across all these crates. The <a href="http://bluss.github.io/rust-itertools/doc/itertools/index.html">documentation for itertools</a> looks the same as the <a href="https://doc.rust-lang.org/std/">documentation for Rust std library</a>.</p>
<h2 id="related"><a class="zola-anchor" href="#related" aria-label="Anchor link for: related">Related</a></h2>
<ul>
<li><a href="/2015/05/06/creating-a-rust-function-that-accepts-string-or-str.html">Creating a Rust function that accepts String or &str</a></li>
</ul>
http://activitystrea.ms/schema/1.0/postStrategies for solving 'cannot move out of' borrowing errors in Rust2015-06-09T00:00:00+00:002015-06-09T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/06/09/strategies-for-solving-cannot-move-out-of-borrowing-errors-in-rust.html/<p>The rules around <a href="https://doc.rust-lang.org/stable/book/references-and-borrowing.html#the-rules">references and borrowing</a> in Rust are fairly straight-forward. Given an owned variable, we are allowed to have as many <em>immutable</em> references to that variable as we want. Rust defaults to immutability, so even functions like <a href="https://doc.rust-lang.org/stable/std/primitive.str.html#method.trim">trim</a> are written in such a way that the result is a reference to the original string:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> name = "</span><span style="color:#a3be8c;"> Herman </span><span>".</span><span style="color:#96b5b4;">to_string</span><span>();
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name = name.</span><span style="color:#96b5b4;">trim</span><span>(); </span><span style="color:#65737e;">// == &[1..n-1]
</span><span>}
</span></code></pre>
<p>The only caveat is that I cannot <em>move</em> the <code>name</code> variable anymore. If I try to move <code>name</code>, the compiler will give me an error: <em>cannot move out of <code>name</code> because it is borrowed</em>.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> name = "</span><span style="color:#a3be8c;"> Herman </span><span>".</span><span style="color:#96b5b4;">to_string</span><span>();
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name = name.</span><span style="color:#96b5b4;">trim</span><span>();
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> owned_name = name; </span><span style="color:#65737e;">// move error
</span><span>}
</span></code></pre>
<p>The compiler knows that <code>trimmed_name</code> is a reference to <code>name</code>. As long as <code>trimmed_name</code> is still in scope, the compiler will not let us pass <code>name</code> to a function, reassign it or do any other <em>move</em> operation. We could <code>clone()</code> the <code>name</code> variable and then trim it, but we really just want to let the compiler know when we are done <em>borrowing</em> <code>name</code>. The key word here is <em>scope</em>. If the reference to <code>name</code> goes out of scope, the compiler will let us <em>move</em> <code>name</code> because it is no longer being <em>borrowed</em>. Let us wrap the call to <code>trim()</code> in curly braces to denote a different scope.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> name = "</span><span style="color:#a3be8c;"> Herman </span><span>".</span><span style="color:#96b5b4;">to_string</span><span>();
</span><span>
</span><span> {
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name = name.</span><span style="color:#96b5b4;">trim</span><span>();
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> owned_name = name;
</span><span>}
</span></code></pre>
<p>That is simple enough, but let us take it a step further. Suppose we wanted to get back the length of the trimmed string from within our scope. If we do that inside our curly braces, then <code>trimmed_name_len</code> will no longer exist once we leave that scope.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> name = "</span><span style="color:#a3be8c;"> Herman </span><span>".</span><span style="color:#96b5b4;">to_string</span><span>();
</span><span>
</span><span> {
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name = name.</span><span style="color:#96b5b4;">trim</span><span>();
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name_len = trimmed_name.</span><span style="color:#96b5b4;">len</span><span>();
</span><span> }
</span><span>
</span><span> println!("</span><span style="color:#a3be8c;">Length of trimmed string is </span><span style="color:#d08770;">{}</span><span>", trimmed_name_len); </span><span style="color:#65737e;">// no such variable error
</span><span> </span><span style="color:#b48ead;">let</span><span> owned_name = name;
</span><span>}
</span></code></pre>
<h2 id="strategies"><a class="zola-anchor" href="#strategies" aria-label="Anchor link for: strategies">Strategies</a></h2>
<p>There are a few ways to deal with this. They all look pretty similar, but have different trade-offs. We can return the value from a scoped block of code:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> name = "</span><span style="color:#a3be8c;"> Herman </span><span>".</span><span style="color:#96b5b4;">to_string</span><span>();
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name_len = {
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name = name.</span><span style="color:#96b5b4;">trim</span><span>();
</span><span> trimmed_name.</span><span style="color:#96b5b4;">len</span><span>()
</span><span> };
</span><span>
</span><span> println!("</span><span style="color:#a3be8c;">Length of trimmed string is </span><span style="color:#d08770;">{}</span><span>", trimmed_name_len);
</span><span> </span><span style="color:#b48ead;">let</span><span> owned_name = name;
</span><span>}
</span></code></pre>
<p>This is a cheap and quick way to force the reference to go out of scope. It does not require us to specify parameters or their types nor does it require us to specify the return type. It is not reusable though. We can get some more reuse if we use an anonymous function (or closure):</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> name = "</span><span style="color:#a3be8c;"> Herman </span><span>".</span><span style="color:#96b5b4;">to_string</span><span>();
</span><span>
</span><span> </span><span style="color:#b48ead;">let </span><span style="color:#8fa1b3;">f </span><span>= |</span><span style="color:#bf616a;">name</span><span>: &</span><span style="color:#b48ead;">str</span><span>| {
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name = name.</span><span style="color:#96b5b4;">trim</span><span>();
</span><span> trimmed_name.</span><span style="color:#96b5b4;">len</span><span>()
</span><span> };
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name_len = </span><span style="color:#96b5b4;">f</span><span>(&name);
</span><span>
</span><span> println!("</span><span style="color:#a3be8c;">Length of trimmed string is </span><span style="color:#d08770;">{}</span><span>", trimmed_name_len);
</span><span> </span><span style="color:#b48ead;">let</span><span> owned_name = name;
</span><span>}
</span></code></pre>
<p>A closure requires us to specify parameters and their types, but makes specifying the return type optional. The way this is written, the anonymous function <code>f</code> is only usable within the function scope. If we want complete reusability we can use a normal function:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">len_of_trimmed_string</span><span>(</span><span style="color:#bf616a;">name</span><span>: &</span><span style="color:#b48ead;">str</span><span>) -> </span><span style="color:#b48ead;">usize </span><span>{
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name = name.</span><span style="color:#96b5b4;">trim</span><span>();
</span><span> trimmed_name.</span><span style="color:#96b5b4;">len</span><span>()
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> name = "</span><span style="color:#a3be8c;"> Herman </span><span>".</span><span style="color:#96b5b4;">to_string</span><span>();
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name_len = </span><span style="color:#96b5b4;">len_of_trimmed_string</span><span>(name.</span><span style="color:#96b5b4;">as_ref</span><span>());
</span><span>
</span><span> println!("</span><span style="color:#a3be8c;">Length of trimmed string is </span><span style="color:#d08770;">{}</span><span>", trimmed_name_len);
</span><span> </span><span style="color:#b48ead;">let</span><span> owned_name = name;
</span><span>}
</span></code></pre>
<p>These strategies only work if we are calling immutable functions. We are temporarily keeping the reference to get some other piece of information. This works really well when that information implements the <code>Copy</code> trait, such as numbers or booleans. If we wanted to do something like remove all spaces from a string like <code>"H e r m a n"</code>, then we would be mutating the string. We would have to call <code>name.clone()</code> in order to later <em>move</em> the original <code>name</code> variable.</p>
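<p>As a rough sketch of that last point, the helper below works on a copy so the original can still be moved afterwards. The name <code>spaces_removed</code> is hypothetical and not part of the original examples.</p>

```rust
// Hypothetical helper, not from the original post: returns a copy of `name`
// with every space removed. The input is only borrowed.
fn spaces_removed(name: &str) -> String {
    let mut copy = name.to_string(); // allocate a separate String to mutate
    copy.retain(|c| c != ' '); // the mutation happens on the copy only
    copy
}

fn main() {
    let name = "H e r m a n".to_string();

    let compact = spaces_removed(&name);
    println!("{:?} without spaces is {:?}", name, compact);

    // Still allowed: `name` itself was never mutated or moved.
    let owned_name = name;
    println!("{}", owned_name);
}
```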
<h3 id="closure-without-parameters"><a class="zola-anchor" href="#closure-without-parameters" aria-label="Anchor link for: closure-without-parameters">Closure Without Parameters</a></h3>
<p>You may have wondered if we really did have to specify parameters when using a closure. If we try to access the <code>name</code> variable from within the closure, it will create a reference during compile time. That reference will continue to exist, even if we try to remove the closure <code>f</code> from scope. Example:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> name = "</span><span style="color:#a3be8c;"> Herman </span><span>".</span><span style="color:#96b5b4;">to_string</span><span>();
</span><span>
</span><span> </span><span style="color:#b48ead;">let </span><span style="color:#8fa1b3;">f </span><span>= || {
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name = name.</span><span style="color:#96b5b4;">trim</span><span>();
</span><span> trimmed_name.</span><span style="color:#96b5b4;">len</span><span>()
</span><span> };
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> trimmed_name_len = </span><span style="color:#96b5b4;">f</span><span>();
</span><span>
</span><span> println!("</span><span style="color:#a3be8c;">Length of trimmed string is </span><span style="color:#d08770;">{}</span><span>", trimmed_name_len);
</span><span> </span><span style="color:#b48ead;">let</span><span> owned_name = name; </span><span style="color:#65737e;">// move error
</span><span>}
</span></code></pre>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>error: cannot move out of `name` because it is borrowed
</span><span> let owned_name = name;
</span><span> ^~~~~~~~~~
</span><span>note: borrow of `name` occurs here
</span><span> let f = || {
</span><span> let trimmed_name = name.trim();
</span><span> trimmed_name.len()
</span><span> };
</span><span>note: in expansion of closure expansion
</span></code></pre>
<h2 id="real-world-example"><a class="zola-anchor" href="#real-world-example" aria-label="Anchor link for: real-world-example">Real World Example</a></h2>
<p>The above examples are pretty contrived. However, you will run into this when you are breaking down functions into smaller parts. In the example below, I was using a <code>find_matches</code> function that required an input of type <code>&str</code>. Given a <code>PathBuf</code>, I needed to call the immutable <code>file_name()</code> method on it and then convert it to a <code>&str</code> by calling <code>to_str()</code> before calling <code>find_matches(file_name)</code>. In order to return a tuple of <code>(p, matches)</code>, I had to make sure the reference created by <code>file_name</code> was out of scope. I chose to use a function, but could have used curly braces or a closure as we discussed above.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">find_matches</span><span>(</span><span style="color:#bf616a;">s</span><span>: &</span><span style="color:#b48ead;">str</span><span>) -> </span><span style="color:#b48ead;">f64 </span><span>{
</span><span> </span><span style="color:#65737e;">// ...
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">count_filename_matches</span><span>(</span><span style="color:#bf616a;">path</span><span>: &Path) -> </span><span style="color:#b48ead;">f64 </span><span>{
</span><span> </span><span style="color:#b48ead;">let</span><span> file_name = path.</span><span style="color:#96b5b4;">file_name</span><span>()
</span><span> .</span><span style="color:#96b5b4;">and_then</span><span>(|</span><span style="color:#bf616a;">f</span><span>| f.</span><span style="color:#96b5b4;">to_str</span><span>())
</span><span> .</span><span style="color:#96b5b4;">unwrap_or_else</span><span>(|| {
</span><span> debug!("</span><span style="color:#a3be8c;">Unable to determine filename for {:?}</span><span>", path);
</span><span> ""
</span><span> });
</span><span>
</span><span> </span><span style="color:#96b5b4;">find_matches</span><span>(file_name)
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">find_filename_matches_in_path</span><span>(</span><span style="color:#bf616a;">path</span><span>: &</span><span style="color:#b48ead;">str</span><span>) -> Vec<(PathBuf, </span><span style="color:#b48ead;">f64</span><span>)> {
</span><span> fs::read_dir(path).</span><span style="color:#96b5b4;">unwrap</span><span>()
</span><span> .</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">p</span><span>| p.</span><span style="color:#96b5b4;">unwrap</span><span>().</span><span style="color:#96b5b4;">path</span><span>())
</span><span> .</span><span style="color:#96b5b4;">map</span><span>(|</span><span style="color:#bf616a;">p</span><span>| {
</span><span> </span><span style="color:#b48ead;">let</span><span> matches = </span><span style="color:#96b5b4;">count_filename_matches</span><span>(p.</span><span style="color:#96b5b4;">as_ref</span><span>());
</span><span> (p, matches)
</span><span> })
</span><span> .</span><span style="color:#96b5b4;">filter</span><span>(|&(</span><span style="color:#bf616a;">ref _p</span><span>, </span><span style="color:#bf616a;">matches</span><span>)| {
</span><span> matches > </span><span style="color:#d08770;">0.0
</span><span> })
</span><span> .</span><span style="color:#96b5b4;">collect</span><span>()
</span><span>}
</span></code></pre>
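<p>The snippet above leaves out its imports and the body of <code>find_matches</code>. Here is a self-contained sketch of the same shape that can be run as-is. The scoring rule inside <code>find_matches</code> and the <code>score_paths</code> helper (which takes a list of paths instead of reading a directory) are assumptions made purely for illustration.</p>

```rust
use std::path::{Path, PathBuf};

// Placeholder scoring rule (an assumption): count occurrences of "rs".
fn find_matches(s: &str) -> f64 {
    s.matches("rs").count() as f64
}

fn count_filename_matches(path: &Path) -> f64 {
    // The borrowed `file_name` lives only inside this function.
    let file_name = path.file_name().and_then(|f| f.to_str()).unwrap_or("");
    find_matches(file_name)
}

// Hypothetical stand-in for `find_filename_matches_in_path` that skips the
// filesystem and scores the paths it is given.
fn score_paths(paths: Vec<PathBuf>) -> Vec<(PathBuf, f64)> {
    paths
        .into_iter()
        .map(|p| {
            // The borrow of `p` ends when the call returns, so `p` can be moved.
            let matches = count_filename_matches(&p);
            (p, matches)
        })
        .filter(|(_, matches)| *matches > 0.0)
        .collect()
}

fn main() {
    let paths = vec![PathBuf::from("src/main.rs"), PathBuf::from("README.md")];
    for (path, matches) in score_paths(paths) {
        println!("{} scored {}", path.display(), matches);
    }
}
```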
<p>Creating a Rust function that returns a &str or String (2015-05-29, Herman J. Radtke III): <a href="https://hermanradtke.com/2015/05/29/creating-a-rust-function-that-returns-string-or-str.html/">https://hermanradtke.com/2015/05/29/creating-a-rust-function-that-returns-string-or-str.html/</a></p>
<link rel="alternate" href="http://habrahabr.ru/post/274565/" hreflang="ru" />
<link rel="alternate" href="/2015/05/29/creating-a-rust-function-that-returns-string-or-str.html" hreflang="en" />
<link rel="alternate" href="/2015/05/29/creating-a-rust-function-that-returns-string-or-str.html" hreflang="x-default" />
<p><a href="http://habrahabr.ru/post/274565/">Russian Translation</a></p>
<p>We learned how to <a href="/2015/05/06/creating-a-rust-function-that-accepts-string-or-str.html">create a function that accepts String or &str</a> as an argument. Now I want to show you how to create a function that returns either <code>String</code> or <code>&str</code>. I also want to discuss why we would want to do this. To start, let us write a function to remove all the spaces from a given string. Our function might look something like this:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">remove_spaces</span><span>(</span><span style="color:#bf616a;">input</span><span>: &</span><span style="color:#b48ead;">str</span><span>) -> String {
</span><span> </span><span style="color:#b48ead;">let mut</span><span> buf = String::with_capacity(input.</span><span style="color:#96b5b4;">len</span><span>());
</span><span>
</span><span> </span><span style="color:#b48ead;">for</span><span> c in input.</span><span style="color:#96b5b4;">chars</span><span>() {
</span><span> </span><span style="color:#b48ead;">if</span><span> c != ' ' {
</span><span> buf.</span><span style="color:#96b5b4;">push</span><span>(c);
</span><span> }
</span><span> }
</span><span>
</span><span> buf
</span><span>}
</span><span>
</span></code></pre>
<p>This function allocates memory for a string buffer, loops through each character of <code>input</code> and appends all non-space characters to the string buffer. Now I ask: what if my input did not contain spaces at all? The value of <code>input</code> would be the same as <code>buf</code>. In that case, it would be more efficient to not create <code>buf</code> in the first place. Instead, we would like to just return the given <code>input</code> back to the caller. However, the type of <code>input</code> is <code>&str</code> and our function returns a <code>String</code>. We could change the type of <code>input</code> to a <code>String</code>:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">remove_spaces</span><span>(</span><span style="color:#bf616a;">input</span><span>: String) -> String { ... }
</span></code></pre>
<p>but this causes two problems. First, by making <code>input</code> of type <code>String</code> we are forcing the caller to <em>move</em> the ownership of <code>input</code> into our function. This prevents the caller from using that value in the future. We should only take ownership of <code>input</code> if we actually need it. Second, the input might already be of type <code>&str</code> and we are now forcing the caller to convert it into a <code>String</code> which defeats our attempts to not allocate new memory when creating <code>buf</code>.</p>
<h2 id="clone-on-write"><a class="zola-anchor" href="#clone-on-write" aria-label="Anchor link for: clone-on-write">Clone-on-write</a></h2>
<p>What we really want is the ability to return our input string (<code>&str</code>) if there are no spaces and to return a new string (<code>String</code>) if there are spaces we need to remove. This is where the clone-on-write or <a href="https://doc.rust-lang.org/stable/std/borrow/enum.Cow.html">Cow</a> type can be used. The <code>Cow</code> type allows us to abstract away whether something is <code>Owned</code> or <code>Borrowed</code>. In our example, the <code>&str</code> is a reference to an existing string so that would be <em>borrowed</em> data. If there are spaces, then we need to allocate memory for a new <code>String</code>. That new <code>String</code> is <em>owned</em> by the <code>buf</code> variable. Normally, we would <em>move</em> the ownership of <code>buf</code> by returning it to the caller. When using <code>Cow</code>, we want to <em>move</em> the ownership of <code>buf</code> into the <code>Cow</code> type and return that.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">use </span><span>std::borrow::Cow;
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">remove_spaces</span><span><</span><span style="color:#b48ead;">'a</span><span>>(</span><span style="color:#bf616a;">input</span><span>: &</span><span style="color:#b48ead;">'a str</span><span>) -> Cow<</span><span style="color:#b48ead;">'a</span><span>, </span><span style="color:#b48ead;">str</span><span>> {
</span><span> </span><span style="color:#b48ead;">if</span><span> input.</span><span style="color:#96b5b4;">contains</span><span>(' ') {
</span><span> </span><span style="color:#b48ead;">let mut</span><span> buf = String::with_capacity(input.</span><span style="color:#96b5b4;">len</span><span>());
</span><span>
</span><span> </span><span style="color:#b48ead;">for</span><span> c in input.</span><span style="color:#96b5b4;">chars</span><span>() {
</span><span> </span><span style="color:#b48ead;">if</span><span> c != ' ' {
</span><span> buf.</span><span style="color:#96b5b4;">push</span><span>(c);
</span><span> }
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">return </span><span>Cow::Owned(buf);
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">return </span><span>Cow::Borrowed(input);
</span><span>}
</span></code></pre>
<p>Our function now checks to see if the given <code>input</code> contains a space and only then allocates memory for a new buffer. If the <code>input</code> does not contain a space, the <code>input</code> is simply returned. We are adding a bit of <a href="https://en.wikipedia.org/wiki/Analysis_of_algorithms">runtime complexity</a> to optimize how we allocate memory. Notice that our <code>Cow</code> type has the same lifetime as the <code>&str</code> type. As we discussed previously, the compiler needs to track the <code>&str</code> reference to know when it can safely free (or <code>Drop</code>) the memory.</p>
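<p>To see which variant comes back for a given input, a small check like the following can help. The <code>describe</code> helper is hypothetical, and <code>remove_spaces</code> is repeated in a condensed form so the sketch is self-contained.</p>

```rust
use std::borrow::Cow;

// Condensed form of the function above, repeated so this compiles on its own.
fn remove_spaces<'a>(input: &'a str) -> Cow<'a, str> {
    if input.contains(' ') {
        Cow::Owned(input.chars().filter(|&c| c != ' ').collect())
    } else {
        Cow::Borrowed(input)
    }
}

// Hypothetical helper: names the variant without consuming the value.
fn describe(value: &Cow<'_, str>) -> &'static str {
    match value {
        Cow::Borrowed(_) => "borrowed",
        Cow::Owned(_) => "owned",
    }
}

fn main() {
    let no_spaces = remove_spaces("Herman");
    let with_spaces = remove_spaces("Herman Radtke");

    println!("{:?} is {}", no_spaces, describe(&no_spaces));
    println!("{:?} is {}", with_spaces, describe(&with_spaces));
}
```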
<p>The beauty of <code>Cow</code> is that it implements the <code>Deref</code> trait, so you can call immutable functions without knowing whether the result is a new string buffer or the original input. Example:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> s = </span><span style="color:#96b5b4;">remove_spaces</span><span>("</span><span style="color:#a3be8c;">Herman Radtke</span><span>");
</span><span>println!("</span><span style="color:#a3be8c;">Length of string is </span><span style="color:#d08770;">{}</span><span>", s.</span><span style="color:#96b5b4;">len</span><span>());
</span></code></pre>
<p>If I do need to mutate <code>s</code>, then I can convert it into an <em>owned</em> variable using the <code>into_owned()</code> function. If the variant of <code>Cow</code> was already <code>Owned</code> then we are simply moving ownership. If the variant of <code>Cow</code> is <code>Borrowed</code>, then we are allocating memory. This allows us to lazily clone (allocate memory) only when we want to write (or mutate) the variable.</p>
<p>Example where a <code>Cow::Borrowed</code> is mutated:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> s = </span><span style="color:#96b5b4;">remove_spaces</span><span>("</span><span style="color:#a3be8c;">Herman</span><span>"); </span><span style="color:#65737e;">// s is a Cow::Borrowed variant
</span><span style="color:#b48ead;">let</span><span> len = s.</span><span style="color:#96b5b4;">len</span><span>(); </span><span style="color:#65737e;">// immutable function call using Deref
</span><span style="color:#b48ead;">let</span><span> owned: String = s.</span><span style="color:#96b5b4;">into_owned</span><span>(); </span><span style="color:#65737e;">// memory is allocated for a new string
</span></code></pre>
<p>Example where a <code>Cow::Owned</code> is mutated:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> s = </span><span style="color:#96b5b4;">remove_spaces</span><span>("</span><span style="color:#a3be8c;">Herman Radtke</span><span>"); </span><span style="color:#65737e;">// s is a Cow::Owned variant
</span><span style="color:#b48ead;">let</span><span> len = s.</span><span style="color:#96b5b4;">len</span><span>(); </span><span style="color:#65737e;">// immutable function call using Deref
</span><span style="color:#b48ead;">let</span><span> owned: String = s.</span><span style="color:#96b5b4;">into_owned</span><span>(); </span><span style="color:#65737e;">// no new memory allocated as we already had a String
</span></code></pre>
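<p>If you want to mutate in place and keep working with a <code>Cow</code>, the standard library also offers <code>Cow::to_mut()</code>, which clones borrowed data only at the moment a mutable reference is requested. A minimal sketch, where <code>append_suffix</code> is a hypothetical helper:</p>

```rust
use std::borrow::Cow;

// Hypothetical helper: appends `suffix` only when there is something to add.
fn append_suffix<'a>(mut s: Cow<'a, str>, suffix: &str) -> Cow<'a, str> {
    if !suffix.is_empty() {
        // `to_mut` turns a Borrowed value into an Owned one (allocating),
        // and simply hands back the existing String if it is already Owned.
        s.to_mut().push_str(suffix);
    }
    s
}

fn main() {
    let untouched = append_suffix(Cow::Borrowed("Herman"), "");
    let changed = append_suffix(Cow::Borrowed("Herman"), " Radtke");

    println!("{} / {}", untouched, changed);
}
```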
<p>The idea behind <code>Cow</code> is two-fold:</p>
<ol>
<li>Delay the allocation of memory for as long as possible. In the best case, we never have to allocate any new memory.</li>
<li>Allow the caller of our <code>remove_spaces</code> function to not care if memory was allocated or not. The usage of the <code>Cow</code> type is the same in either case.</li>
</ol>
<h3 id="leveraging-the-into-trait"><a class="zola-anchor" href="#leveraging-the-into-trait" aria-label="Anchor link for: leveraging-the-into-trait">Leveraging the <code>Into</code> Trait</a></h3>
<p>We previously discussed using the <a href="/2015/05/06/creating-a-rust-function-that-accepts-string-or-str.html"><code>Into</code> trait</a> to convert a <code>&str</code> into a <code>String</code>. We can also use the <code>Into</code> trait to convert the <code>&str</code> or <code>String</code> into the proper <code>Cow</code> variant. When we call <code>.into()</code>, the compiler picks the right conversion automatically. Using <code>.into()</code> will not speed up or slow down the code. It is simply an option to avoid having to specify <code>Cow::Owned</code> or <code>Cow::Borrowed</code> explicitly.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">remove_spaces</span><span><</span><span style="color:#b48ead;">'a</span><span>>(</span><span style="color:#bf616a;">input</span><span>: &</span><span style="color:#b48ead;">'a str</span><span>) -> Cow<</span><span style="color:#b48ead;">'a</span><span>, </span><span style="color:#b48ead;">str</span><span>> {
</span><span> </span><span style="color:#b48ead;">if</span><span> input.</span><span style="color:#96b5b4;">contains</span><span>(' ') {
</span><span> </span><span style="color:#b48ead;">let mut</span><span> buf = String::with_capacity(input.</span><span style="color:#96b5b4;">len</span><span>());
</span><span> </span><span style="color:#b48ead;">let</span><span> v: Vec<</span><span style="color:#b48ead;">char</span><span>> = input.</span><span style="color:#96b5b4;">chars</span><span>().</span><span style="color:#96b5b4;">collect</span><span>();
</span><span>
</span><span> </span><span style="color:#b48ead;">for</span><span> c in v {
</span><span> </span><span style="color:#b48ead;">if</span><span> c != ' ' {
</span><span> buf.</span><span style="color:#96b5b4;">push</span><span>(c);
</span><span> }
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">return</span><span> buf.</span><span style="color:#96b5b4;">into</span><span>();
</span><span> }
</span><span> </span><span style="color:#b48ead;">return</span><span> input.</span><span style="color:#96b5b4;">into</span><span>();
</span><span>}
</span></code></pre>
<p>We can also clean this up a bit using just iterators:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">remove_spaces</span><span><</span><span style="color:#b48ead;">'a</span><span>>(</span><span style="color:#bf616a;">input</span><span>: &</span><span style="color:#b48ead;">'a str</span><span>) -> Cow<</span><span style="color:#b48ead;">'a</span><span>, </span><span style="color:#b48ead;">str</span><span>> {
</span><span> </span><span style="color:#b48ead;">if</span><span> input.</span><span style="color:#96b5b4;">contains</span><span>(' ') {
</span><span> input
</span><span> .</span><span style="color:#96b5b4;">chars</span><span>()
</span><span> .</span><span style="color:#96b5b4;">filter</span><span>(|&</span><span style="color:#bf616a;">x</span><span>| x != ' ')
</span><span> .collect::<std::string::String>()
</span><span> .</span><span style="color:#96b5b4;">into</span><span>()
</span><span> } </span><span style="color:#b48ead;">else </span><span>{
</span><span> input.</span><span style="color:#96b5b4;">into</span><span>()
</span><span> }
</span><span>}
</span></code></pre>
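<p>To make the conversion visible on its own, here is a small sketch with two hypothetical helpers. Each relies only on the standard <code>From</code> implementations for <code>Cow</code>.</p>

```rust
use std::borrow::Cow;

// A &str converts into the Borrowed variant; nothing is allocated.
fn borrowed_cow(s: &str) -> Cow<'_, str> {
    s.into()
}

// A String converts into the Owned variant; ownership is moved, not copied.
fn owned_cow(s: String) -> Cow<'static, str> {
    s.into()
}

fn main() {
    let a = borrowed_cow("Herman");
    let b = owned_cow("Herman".to_string());

    println!("{:?} {:?}", a, b);
}
```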
<h2 id="real-world-uses-of-cow"><a class="zola-anchor" href="#real-world-uses-of-cow" aria-label="Anchor link for: real-world-uses-of-cow">Real World Uses of <code>Cow</code></a></h2>
<p>My example of removing spaces may seem a bit contrived, but there are some great real-world applications of this strategy. Inside of Rust core there is a function that <a href="https://github.com/rust-lang/rust/blob/720735b9430f7ff61761f54587b82dab45317938/src/libcollections/string.rs#L153">converts bytes to UTF-8 in a lossy manner</a> and a function that will <a href="https://github.com/rust-lang/rust/blob/c23a9d42ea082830593a73d25821842baf9ccf33/src/libsyntax/parse/lexer/mod.rs#L271">translate CRLF to LF</a>. Both of these functions have a case where a <code>&str</code> can be returned in the optimal case and another case where a <code>String</code> has to be allocated. Other examples I can think of are properly encoding an XML/HTML string or properly escaping a SQL query. In many cases, the input is already properly encoded or escaped. In those cases, it is better to just return the input string back to the caller. When the input does need to be modified, we are forced to allocate new memory, in the form of a <code>String</code> buffer, and return that to the caller.</p>
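<p>The lossy UTF-8 conversion is exposed as <code>String::from_utf8_lossy</code>, so it is easy to try. In this sketch the <code>decode</code> wrapper is a hypothetical name added only to keep the example short.</p>

```rust
use std::borrow::Cow;

// Thin wrapper (hypothetical name) around the standard library function.
fn decode(bytes: &[u8]) -> Cow<'_, str> {
    String::from_utf8_lossy(bytes)
}

fn main() {
    // Valid UTF-8: the bytes are reused, so we get the Borrowed variant.
    let clean = decode(b"Herman");

    // Invalid byte 0xFF: a new String is built with a replacement character.
    let repaired = decode(b"Her\xFFman");

    println!("{:?}", clean);
    println!("{:?}", repaired);
}
```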
<h2 id="why-use-string-with-capacity"><a class="zola-anchor" href="#why-use-string-with-capacity" aria-label="Anchor link for: why-use-string-with-capacity">Why use <code>String::with_capacity()</code> ?</a></h2>
<p>While we are on the topic of efficient memory management, notice that I used <code>String::with_capacity()</code> instead of <code>String::new()</code> when creating the string buffer. You can use <code>String::new()</code> instead of <code>String::with_capacity()</code>, but it is more efficient to allocate memory for the buffer all at once instead of re-allocating memory as we push more <code>char</code>s onto the buffer. Let us walk through what Rust does when we use <code>String::new()</code> and then push characters onto the string.</p>
<p>A <code>String</code> is really a <code>Vec<u8></code> holding UTF-8 encoded bytes. When <code>String::new()</code> is called, Rust creates a vector with zero bytes of capacity. If we then push the character <code>a</code> onto the string buffer, like <code>buf.push('a')</code>, Rust has to increase the capacity of the vector. In the version of the standard library linked below, it will allocate 2 bytes of memory. As we push more characters and exceed the capacity, Rust will double the size of the string by re-allocating memory. It will continue to double the size each time the capacity is exceeded. The sequence of memory allocation is <code>0, 2, 4, 8, 16, 32 ... 2^n</code> where n is the number of times Rust detected that capacity was exceeded. The exact starting size and growth factor are implementation details and can differ between releases. Re-allocating memory is really slow (edit: kmc_v3 <a href="http://www.reddit.com/r/rust/comments/37q8sr/creating_a_rust_function_that_returns_a_str_or/croylbu">explained</a> that it might not be as slow as I thought). Not only does Rust have to ask the allocator, and possibly the kernel, for new memory, it may also have to copy the contents of the vector from the old memory space to the new memory space. Check out the source code for <a href="https://github.com/rust-lang/rust/blob/720735b9430f7ff61761f54587b82dab45317938/src/libcollections/vec.rs#L628">Vec::push</a> to see the resizing logic first-hand.</p>
<p>In general, we want to allocate new memory only when we need it and only allocate as much as we need. For small strings, like <code>remove_spaces("Herman Radtke")</code>, the overhead of re-allocating memory is not a big deal. What if I wanted to remove all of the spaces in each JavaScript file for my website? The overhead of re-allocating memory for a buffer is much higher. When pushing data onto a vector (String or otherwise) it can be a good idea to specify a capacity to start with. The best situation is when you already know the length and the capacity can be exactly set. The <a href="https://github.com/rust-lang/rust/blob/720735b9430f7ff61761f54587b82dab45317938/src/libcollections/vec.rs#L147-152">code comments</a> for <code>Vec</code> give a similar warning.</p>
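<p>One rough way to observe this is to count how often the capacity changes while pushing characters. The sketch below uses a capacity change as a stand-in for a re-allocation, which is an assumption, and <code>count_capacity_changes</code> is a hypothetical helper. The exact numbers printed for <code>String::new()</code> depend on the Rust release.</p>

```rust
// Hypothetical helper: pushes every char of `input` onto `buf` and counts
// how many times the buffer's capacity changed along the way.
fn count_capacity_changes(input: &str, mut buf: String) -> usize {
    let mut changes = 0;
    let mut last_capacity = buf.capacity();

    for c in input.chars() {
        buf.push(c);
        if buf.capacity() != last_capacity {
            changes += 1;
            last_capacity = buf.capacity();
        }
    }

    changes
}

fn main() {
    let input = "Herman Radtke ".repeat(100);

    let grown = count_capacity_changes(&input, String::new());
    let preallocated = count_capacity_changes(&input, String::with_capacity(input.len()));

    println!("String::new(): {} capacity changes", grown);
    println!("String::with_capacity(): {} capacity changes", preallocated);
}
```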
<h2 id="related"><a class="zola-anchor" href="#related" aria-label="Anchor link for: related">Related</a></h2>
<ul>
<li><a href="http://hermanradtke.com/2015/05/03/string-vs-str-in-rust-functions.html">String vs &str in Rust functions</a></li>
<li><a href="http://hermanradtke.com/2015/05/06/creating-a-rust-function-that-accepts-string-or-str.html">Creating a Rust function that accepts String or &str</a></li>
</ul>
<p>Creating a Rust function that accepts String or &str (2015-05-06, Herman J. Radtke III): <a href="https://hermanradtke.com/2015/05/06/creating-a-rust-function-that-accepts-string-or-str.html/">https://hermanradtke.com/2015/05/06/creating-a-rust-function-that-accepts-string-or-str.html/</a></p>
<link rel="alternate" href="http://habrahabr.ru/post/274455/" hreflang="ru" />
<link rel="alternate" href="/2015/05/06/creating-a-rust-function-that-accepts-string-or-str.html" hreflang="en" />
<link rel="alternate" href="/2015/05/06/creating-a-rust-function-that-accepts-string-or-str.html" hreflang="x-default" />
<p><a href="http://habrahabr.ru/post/274455/">Russian Translation</a></p>
<p>In my <a href="/2015/05/03/string-vs-str-in-rust-functions.html">last post</a> we talked a lot about using <code>&str</code> as the preferred type for functions accepting a string argument. Towards the end of that post there was some discussion about when to use <code>String</code> vs <code>&str</code> in a <code>struct</code>. I think this advice is good, but there are cases where using <code>&str</code> instead of <code>String</code> is not optimal. We need another strategy for these use cases.</p>
<h2 id="a-struct-containing-strings"><a class="zola-anchor" href="#a-struct-containing-strings" aria-label="Anchor link for: a-struct-containing-strings">A struct Containing Strings</a></h2>
<p>Consider the <code>Person</code> struct below. For the sake of discussion, let's say <code>Person</code> has a real need to own the <code>name</code> variable. We choose to use the <code>String</code> type instead of <code>&str</code>.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person {
</span><span> </span><span style="color:#bf616a;">name</span><span>: String,
</span><span>}
</span></code></pre>
<p>Now we need to implement a <code>new()</code> function. Based on my last blog post, we prefer a <code>&str</code>:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">impl </span><span>Person {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">new </span><span>(</span><span style="color:#bf616a;">name</span><span>: &</span><span style="color:#b48ead;">str</span><span>) -> Person {
</span><span> Person { name: name.</span><span style="color:#96b5b4;">to_string</span><span>() }
</span><span> }
</span><span>}
</span></code></pre>
<p>This works as long as we remember to call <code>.to_string()</code> inside of the <code>new()</code> function. However, the ergonomics of this function are less than ideal. If we use a string literal, then we can make a new <code>Person</code> like <code>Person::new("Herman")</code>. If we already have a <code>String</code> though, we need to ask for a reference to the <code>String</code>:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">let</span><span> name = "</span><span style="color:#a3be8c;">Herman</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>();
</span><span style="color:#b48ead;">let</span><span> person = Person::new(name.</span><span style="color:#96b5b4;">as_ref</span><span>());
</span></code></pre>
<p>It feels like we are going in circles though. We had a <code>String</code>, then we called <code>as_ref()</code> to turn it into a <code>&str</code> only to then turn it back into a <code>String</code> inside of the <code>new()</code> function. We could go back to using a <code>String</code> like <code>fn new(name: String) -> Person {</code>, but that means we need to force the caller to use <code>.to_string()</code> whenever there is a string literal.</p>
<h2 id="into-conversions"><a class="zola-anchor" href="#into-conversions" aria-label="Anchor link for: into-conversions">Into<T> conversions</a></h2>
<p>We can make our function easier for the caller to work with by using the <a href="http://doc.rust-lang.org/nightly/core/convert/trait.Into.html">Into trait</a>. This trait can automatically convert a <code>&str</code> into a <code>String</code>. If we already have a <code>String</code>, then no conversion happens.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person {
</span><span> </span><span style="color:#bf616a;">name</span><span>: String,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">impl </span><span>Person {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">new</span><span><S: Into<String>>(</span><span style="color:#bf616a;">name</span><span>: S) -> Person {
</span><span> Person { name: name.</span><span style="color:#96b5b4;">into</span><span>() }
</span><span> }
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> person = Person::new("</span><span style="color:#a3be8c;">Herman</span><span>");
</span><span> </span><span style="color:#b48ead;">let</span><span> person = Person::new("</span><span style="color:#a3be8c;">Herman</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>());
</span><span>}
</span></code></pre>
<p>This syntax for <code>new()</code> looks a little different. We are using <a href="http://doc.rust-lang.org/nightly/book/generics.html">Generics</a> and <a href="http://doc.rust-lang.org/nightly/book/traits.html">Traits</a> to tell Rust that some type <code>S</code> must implement the trait <code>Into</code> for type <code>String</code>. The <code>String</code> type implements <code>Into<String></code> as a no-op because we already have a <code>String</code>. The <code>&str</code> type implements <code>Into<String></code> by doing the same conversion that the <code>.to_string()</code> call performed in our original <code>new()</code> function. So we aren't side-stepping the need for the conversion, but we are taking away the need for the caller to do it. You might wonder if using <code>Into<String></code> hurts performance and the answer is no. Rust uses <a href="http://doc.rust-lang.org/nightly/book/trait-objects.html#static-dispatch">static dispatch</a> and the concept of <a href="http://stackoverflow.com/a/14198060/775246">monomorphization</a> to handle all this at compile time.</p>
<p>Don't worry if things like <em>static dispatch</em> and <em>monomorphization</em> are confusing. You just need to know that using the syntax above you can create functions that accept both <code>String</code> and <code>&str</code>. If you are thinking that <code>fn new<S: Into<String>>(name: S) -> Person {</code> is a lot of syntax, it is. It is important to point out though that there is nothing special about <code>Into<String></code>. It is just a trait that is part of the Rust standard library. You could implement this trait yourself if you wanted to. You can implement similar traits you find useful and publish them on <a href="https://crates.io/">crates.io</a>. All this userland power is what makes Rust an awesome language.</p>
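<p>To make the point about implementing the trait yourself concrete, here is a minimal sketch. The <code>Nickname</code> type is hypothetical and exists only for illustration. It implements <code>From<Nickname> for String</code>, and the standard library then provides <code>Into<String></code> for it automatically, so it can be passed straight to <code>Person::new()</code>:</p>

```rust
// Hypothetical newtype, used only for illustration.
struct Nickname(String);

// Implementing `From<Nickname> for String` also gives us
// `Into<String>` for `Nickname` via the standard library's blanket impl.
impl From<Nickname> for String {
    fn from(nick: Nickname) -> String {
        nick.0
    }
}

struct Person {
    name: String,
}

impl Person {
    fn new<S: Into<String>>(name: S) -> Person {
        Person { name: name.into() }
    }
}

fn main() {
    let from_literal = Person::new("Herman");
    let from_owned = Person::new(String::from("Herman"));
    let from_nickname = Person::new(Nickname(String::from("Hermie")));

    println!("{}", from_literal.name);
    println!("{}", from_owned.name);
    println!("{}", from_nickname.name);
}
```

<p>Implementing <code>From</code> rather than <code>Into</code> directly is the usual convention, because the <code>Into</code> side comes for free.</p>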
<h3 id="another-way-to-write-person-new"><a class="zola-anchor" href="#another-way-to-write-person-new" aria-label="Anchor link for: another-way-to-write-person-new">Another Way To Write Person::new()</a></h3>
<p>The <em>where</em> syntax also works and may be easier to read, especially if the function signature becomes more complex:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person {
</span><span> </span><span style="color:#bf616a;">name</span><span>: String,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">impl </span><span>Person {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">new</span><span><S>(</span><span style="color:#bf616a;">name</span><span>: S) -> Person </span><span style="color:#b48ead;">where</span><span> S: Into<String> {
</span><span> Person { name: name.</span><span style="color:#96b5b4;">into</span><span>() }
</span><span> }
</span><span>}
</span></code></pre>
<h2 id="related"><a class="zola-anchor" href="#related" aria-label="Anchor link for: related">Related</a></h2>
<ul>
<li><a href="http://hermanradtke.com/2015/05/03/string-vs-str-in-rust-functions.html">String vs &str in Rust functions</a></li>
<li><a href="http://hermanradtke.com/2015/05/29/creating-a-rust-function-that-returns-string-or-str.html">Creating a Rust function that returns a &str or String</a></li>
</ul>
http://activitystrea.ms/schema/1.0/postString vs &str in Rust functions2015-05-03T00:00:00+00:002015-05-03T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/05/03/string-vs-str-in-rust-functions.html/<link rel="alternate" href="http://habrahabr.ru/post/274585/" hreflang="ru" />
<link rel="alternate" href="/2015/05/03/string-vs-str-in-rust-functions.html" hreflang="en" />
<link rel="alternate" href="/2015/05/03/string-vs-str-in-rust-functions.html" hreflang="x-default" />
<p><a href="https://habrahabr.ru/post/274485/">Russian Translation</a></p>
<p>For all the people frustrated by having to use <code>to_string()</code> to get programs to compile, this post is for you. For those not quite understanding why Rust has two string types, <code>String</code> and <code>&str</code>, I hope to shed a little light on the matter.</p>
<h2 id="functions-that-accept-a-string"><a class="zola-anchor" href="#functions-that-accept-a-string" aria-label="Anchor link for: functions-that-accept-a-string">Functions That Accept A String</a></h2>
<p>I want to discuss how to build interfaces that accept strings. I am an avid hypermedia fan and am obsessed about designing interfaces that are easy to use. Let's start with a method that accepts a <a href="https://doc.rust-lang.org/std/string/struct.String.html?search=String">String</a>. Our search hints that <code>std::string::String</code> is a good choice here.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">print_me</span><span>(</span><span style="color:#bf616a;">msg</span><span>: String) {
</span><span> println!("</span><span style="color:#a3be8c;">the message is </span><span style="color:#d08770;">{}</span><span>", msg);
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> msg = "</span><span style="color:#a3be8c;">hello world</span><span>";
</span><span> </span><span style="color:#96b5b4;">print_me</span><span>(msg);
</span><span>}
</span></code></pre>
<p>This gives a compiler error:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>expected `collections::string::String`,
</span><span> found `&'static str`
</span></code></pre>
<p>So a string literal is of type <code>&str</code> and does not appear compatible with the type <code>String</code>. We can change the <code>msg</code> variable to a <code>String</code> and compile successfully: <code>let msg = "hello world".to_string();</code>. This works, but it is analogous to using <code>clone()</code> to get around ownership/borrowing errors. Here are three reasons to change <code>print_me</code> to accept a <code>&str</code> instead:</p>
<ul>
<li>The <code>&</code> symbol is a reference type and means we are <em>borrowing</em> the variable. When <code>print_me</code> is done with the variable, ownership will return to the original owner. Unless we have good reason to <em>move</em> ownership of the <code>msg</code> variable into our function, we should elect to borrow.</li>
<li>Using a reference is more efficient. Using <code>String</code> for <code>msg</code> means the program must allocate and <em>copy</em> the value, which is what <code>.to_string()</code> does. When using a reference, such as <code>&str</code>, no copy is made.</li>
<li>A <code>String</code> type can be magically turned into a <code>&str</code> type using the <a href="http://doc.rust-lang.org/nightly/std/ops/trait.Deref.html">Deref</a> trait and type coercion. This will make more sense with an example.</li>
</ul>
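<p>The first two points can be sketched in a few lines. The <code>format_me</code> function below is a hypothetical variant of <code>print_me</code> that returns the text instead of printing it, so the result is easy to inspect. Because it only borrows its argument, the caller can reuse the same <code>String</code> as often as it likes without copying it:</p>

```rust
// Hypothetical variant of `print_me` that returns the formatted text.
fn format_me(msg: &str) -> String {
    format!("the message is {}", msg)
}

fn main() {
    let owned = String::from("hello world");

    // Borrowing: `owned` is only lent out, so it stays usable afterwards.
    let first = format_me(&owned);
    let second = format_me(&owned);

    println!("{}", first);
    println!("{}", second);
    println!("still usable: {}", owned);
}
```

<p>Passing <code>&owned</code> where a <code>&str</code> is expected relies on the coercion mentioned in the third point.</p>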
<h2 id="example-of-deref-coercion"><a class="zola-anchor" href="#example-of-deref-coercion" aria-label="Anchor link for: example-of-deref-coercion">Example of Deref Coercion</a></h2>
<p>This example creates strings in four different ways that all work with the <code>print_me</code> function. The key to making this all work is passing values by reference. Rather than passing <code>owned_string</code> as a <code>String</code> to <code>print_me</code>, we instead pass it as <code>&String</code>. When the compiler sees a <code>&String</code> being passed to a function that takes <code>&str</code>, it coerces the <code>&String</code> into a <code>&str</code>. This same coercion takes place for the reference counted and atomically reference counted strings. The <code>string</code> variable is already a reference, so there is no need to use a <code>&</code> when calling <code>print_me(string)</code>. Knowing this, we no longer need to have <code>.to_string()</code> calls littering our code.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">print_me</span><span>(</span><span style="color:#bf616a;">msg</span><span>: &</span><span style="color:#b48ead;">str</span><span>) { println!("</span><span style="color:#a3be8c;">msg = </span><span style="color:#d08770;">{}</span><span>", msg); }
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> string = "</span><span style="color:#a3be8c;">hello world</span><span>";
</span><span> </span><span style="color:#96b5b4;">print_me</span><span>(string);
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> owned_string = "</span><span style="color:#a3be8c;">hello world</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>(); </span><span style="color:#65737e;">// or String::from("hello world")
</span><span> </span><span style="color:#96b5b4;">print_me</span><span>(&owned_string);
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> counted_string = std::rc::Rc::new("</span><span style="color:#a3be8c;">hello world</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>());
</span><span> </span><span style="color:#96b5b4;">print_me</span><span>(&counted_string);
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> atomically_counted_string = std::sync::Arc::new("</span><span style="color:#a3be8c;">hello world</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>());
</span><span> </span><span style="color:#96b5b4;">print_me</span><span>(&atomically_counted_string);
</span><span>}
</span></code></pre>
<p>You can also use Deref coercion with other types, such as a <code>Vec</code>. After all, a <code>String</code> is just a vector of 8-bit bytes (a <code>Vec<u8></code>) that is guaranteed to be valid UTF-8. Read more about <a href="http://doc.rust-lang.org/nightly/book/deref-coercions.html">Deref coercions</a> in the Rust lang book.</p>
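<p>As a rough sketch of the same idea with vectors, the hypothetical <code>sum</code> function below takes a slice, <code>&[i32]</code>, which plays the role that <code>&str</code> plays for strings. A borrowed <code>Vec</code>, and even a reference counted <code>Vec</code>, can be passed to it without any conversion calls:</p>

```rust
// Accepts any borrowed slice of i32 values.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let array = [1, 2, 3];
    let vector = vec![1, 2, 3];
    let counted = std::rc::Rc::new(vec![1, 2, 3]);

    println!("{}", sum(&array));   // &[i32; 3] -> &[i32] (array to slice)
    println!("{}", sum(&vector));  // &Vec<i32> -> &[i32] (Deref coercion)
    println!("{}", sum(&counted)); // &Rc<Vec<i32>> -> &Vec<i32> -> &[i32]
}
```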
<h2 id="introducing-struct"><a class="zola-anchor" href="#introducing-struct" aria-label="Anchor link for: introducing-struct">Introducing struct</a></h2>
<p>At this point we should be free of extraneous <code>to_string()</code> calls for our functions. However, we run into some problems when we try to introduce a struct. Using what we just learned, we might make a struct like this:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person {
</span><span> </span><span style="color:#bf616a;">name</span><span>: &</span><span style="color:#b48ead;">str</span><span>,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> _person = Person { name: "</span><span style="color:#a3be8c;">Herman</span><span>" };
</span><span>}
</span></code></pre>
<p>We get the error:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span><anon>:2:11: 2:15 error: missing lifetime specifier [E0106]
</span><span><anon>:2 name: &str,
</span></code></pre>
<p>Rust is trying to ensure that <code>Person</code> does not outlive the reference to <code>name</code>. If <code>Person</code> did manage to outlive <code>name</code>, then we risk our program crashing. The whole point of Rust is to prevent this. So let's start trying to get this code to compile. We need to specify a <a href="http://doc.rust-lang.org/nightly/book/ownership.html#lifetimes">lifetime</a>, or scope, so Rust can keep us safe. The conventional lifetime specifier is <code>'a</code>. I do not know why that was picked, but let's go with that.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person {
</span><span> </span><span style="color:#bf616a;">name</span><span>: &</span><span style="color:#b48ead;">'a str</span><span>,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> _person = Person { name: "</span><span style="color:#a3be8c;">Herman</span><span>" };
</span><span>}
</span></code></pre>
<p>Compile again and we get another error:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span><anon>:2:12: 2:14 error: use of undeclared lifetime name `'a` [E0261]
</span><span><anon>:2 name: &'a str,
</span></code></pre>
<p>Let's think about this. We know we want to hint to the Rust compiler that our struct <code>Person</code> should not outlive <code>name</code>. So, we need to declare our lifetime on the <code>Person</code> struct. Some searching will point us to <code><'a></code> being the syntax to declare lifetimes.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person<</span><span style="color:#b48ead;">'a</span><span>> {
</span><span> </span><span style="color:#bf616a;">name</span><span>: &</span><span style="color:#b48ead;">'a str</span><span>,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> _person = Person { name: "</span><span style="color:#a3be8c;">Herman</span><span>" };
</span><span>}
</span></code></pre>
<p>This compiles! We normally implement methods on our structs though. Let's add a <code>greet</code> method to our <code>Person</code> struct.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person<</span><span style="color:#b48ead;">'a</span><span>> {
</span><span> </span><span style="color:#bf616a;">name</span><span>: &</span><span style="color:#b48ead;">'a str</span><span>,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">impl </span><span>Person {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">greet</span><span>(&</span><span style="color:#bf616a;">self</span><span>) {
</span><span> println!("</span><span style="color:#a3be8c;">Hello, my name is </span><span style="color:#d08770;">{}</span><span>", </span><span style="color:#bf616a;">self</span><span>.name);
</span><span> }
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> person = Person { name: "</span><span style="color:#a3be8c;">Herman</span><span>" };
</span><span> person.</span><span style="color:#96b5b4;">greet</span><span>();
</span><span>}
</span></code></pre>
<p>We now get the error:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span><anon>:5:6: 5:12 error: wrong number of lifetime parameters: expected 1, found 0 [E0107]
</span><span><anon>:5 impl Person {
</span></code></pre>
<p>Our <code>Person</code> struct has a lifetime parameter so our implementation should have it too. Let's add our <code>'a</code> lifetime to the implementation of <code>Person</code> like <code>impl Person<'a> {</code>. Unfortunately, this gives us a confusing error when we compile:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span><anon>:5:13: 5:15 error: use of undeclared lifetime name `'a` [E0261]
</span><span><anon>:5 impl Person<'a> {
</span></code></pre>
<p>In order for us to <em>declare</em> the lifetime, we need to specify the lifetime right after the <code>impl</code> like <code>impl<'a> Person {</code>. Compile again and we get the error:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span><anon>:5:10: 5:16 error: wrong number of lifetime parameters: expected 1, found 0 [E0107]
</span><span><anon>:5 impl<'a> Person {
</span></code></pre>
<p>Now we are back on track. Let's add our lifetime parameter back to the implementation of <code>Person</code> like <code>impl<'a> Person<'a> {</code>. Now our program compiles. Here is the complete working code:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person<</span><span style="color:#b48ead;">'a</span><span>> {
</span><span> </span><span style="color:#bf616a;">name</span><span>: &</span><span style="color:#b48ead;">'a str</span><span>,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">impl</span><span><</span><span style="color:#b48ead;">'a</span><span>> Person<</span><span style="color:#b48ead;">'a</span><span>> {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">greet</span><span>(&</span><span style="color:#bf616a;">self</span><span>) {
</span><span> println!("</span><span style="color:#a3be8c;">Hello, my name is </span><span style="color:#d08770;">{}</span><span>", </span><span style="color:#bf616a;">self</span><span>.name);
</span><span> }
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> person = Person { name: "</span><span style="color:#a3be8c;">Herman</span><span>" };
</span><span> person.</span><span style="color:#96b5b4;">greet</span><span>();
</span><span>}
</span></code></pre>
<h3 id="string-or-str-in-struct"><a class="zola-anchor" href="#string-or-str-in-struct" aria-label="Anchor link for: string-or-str-in-struct">String or &str In struct</a></h3>
<p>The question is now whether to use a String or a <code>&str</code> in your struct. In other words when should we use a reference to another type in a struct? We should use a reference if our struct does not need ownership of the variable. This concept might be a little vague, but there are some rules I use to get at an answer.</p>
<ul>
<li>Do I need to use the variable outside of my struct? Here is a contrived example:</li>
</ul>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person {
</span><span> </span><span style="color:#bf616a;">name</span><span>: String,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">impl </span><span>Person {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">greet</span><span>(&</span><span style="color:#bf616a;">self</span><span>) {
</span><span> println!("</span><span style="color:#a3be8c;">Hello, my name is </span><span style="color:#d08770;">{}</span><span>", </span><span style="color:#bf616a;">self</span><span>.name);
</span><span> }
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> name = String::from("</span><span style="color:#a3be8c;">Herman</span><span>");
</span><span> </span><span style="color:#b48ead;">let</span><span> person = Person { name: name };
</span><span> person.</span><span style="color:#96b5b4;">greet</span><span>();
</span><span> println!("</span><span style="color:#a3be8c;">My name is </span><span style="color:#d08770;">{}</span><span>", name); </span><span style="color:#65737e;">// move error
</span><span>}
</span></code></pre>
<p>I should use a reference here since I need to use the variable later. Here is a real-world example in <a href="https://github.com/rust-lang/rustc-serialize/blob/master/src/json.rs#L552">rustc_serialize</a>. The <code>Encoder</code> struct does not need to own the <code>writer</code> variable that implements <a href="http://doc.rust-lang.org/nightly/std/fmt/trait.Write.html">std::fmt::Write</a>, just use (borrow) it for a little while. In fact, <code>String</code> implements <code>Write</code>. In this example using the <a href="https://github.com/rust-lang/rustc-serialize/blob/master/src/json.rs#L372">encode</a> function, the variable of type <code>String</code> is passed to the Encoder and then returned to the caller of <code>encode</code>.</p>
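<p>One way to resolve the move error in the contrived example is to have the struct borrow the name, using the lifetime syntax from earlier. This is only a sketch: the <code>greeting</code> method is a hypothetical stand-in for <code>greet</code> that returns the text instead of printing it.</p>

```rust
struct Person<'a> {
    name: &'a str,
}

impl<'a> Person<'a> {
    // Returns the greeting instead of printing it so it is easy to check.
    fn greeting(&self) -> String {
        format!("Hello, my name is {}", self.name)
    }
}

fn main() {
    let name = String::from("Herman");
    // `&name` is a `&String`, which coerces to the `&str` the field expects.
    let person = Person { name: &name };
    println!("{}", person.greeting());
    // No move error: `person` only borrowed `name`.
    println!("My name is {}", name);
}
```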
<ul>
<li>Is my type large? If the type is large, then passing it by reference will save unnecessary memory usage. Remember, passing by reference does not cause a copy of the variable. Consider a String buffer that contains a large amount of data. Copying that around will cause the program to be much slower.</li>
</ul>
<p>We should now be able to create functions that accept strings whether they are <code>&str</code>, <code>String</code> or even reference counted. We are also able to create <code>struct</code>s that hold references. The lifetime of the <code>struct</code> is linked to those referenced variables to make sure that the <code>struct</code> does not outlive the referenced variable and cause bad things to happen in our program. We also have an initial understanding of whether the variables in our <code>struct</code> should be types or references to types.</p>
<h3 id="what-about-static"><a class="zola-anchor" href="#what-about-static" aria-label="Anchor link for: what-about-static">What about 'static</a></h3>
<p>Random aside, but I thought it worth mentioning. We can use a <code>'static</code> lifetime to get our original example to compile, but I caution against it:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person {
</span><span> </span><span style="color:#bf616a;">name</span><span>: &</span><span style="color:#b48ead;">'static str</span><span>,
</span><span>}
</span><span>
</span><span style="color:#b48ead;">impl </span><span>Person {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">greet</span><span>(&</span><span style="color:#bf616a;">self</span><span>) {
</span><span> println!("</span><span style="color:#a3be8c;">Hello, my name is </span><span style="color:#d08770;">{}</span><span>", </span><span style="color:#bf616a;">self</span><span>.name);
</span><span> }
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">main</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> person = Person { name: "</span><span style="color:#a3be8c;">Herman</span><span>" };
</span><span> person.</span><span style="color:#96b5b4;">greet</span><span>();
</span><span>}
</span></code></pre>
<p>The <code>'static</code> lifetime is valid for the entire program. You may not need <code>Person</code> or <code>name</code> to live that long.</p>
<h2 id="related"><a class="zola-anchor" href="#related" aria-label="Anchor link for: related">Related</a></h2>
<ul>
<li><a href="/2015/05/06/creating-a-rust-function-that-accepts-string-or-str.html">Creating a Rust function that accepts String or &str</a></li>
</ul>
http://activitystrea.ms/schema/1.0/postReplace Dicts With Rust Enums2015-04-18T00:00:00+00:002015-04-18T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/04/18/replace-dicts-with-rust-enums.html/<p>I listened to the <a href="https://thechangelog.com/151/">Changelog podcast on Rust</a> recently and loved the remark about enabling web developers to get into systems programming. I have been thinking about the way we design code in the web world versus the way we design code in Rust. In the web world, we often use a hash/dict/map to hide hard-coded values behind a nicer interface. Consider an example where you would want to write a function to create the escape sequence for colors in a TTY terminal. You might write something like this in Ruby:</p>
<pre data-lang="ruby" style="background-color:#2b303b;color:#c0c5ce;" class="language-ruby "><code class="language-ruby" data-lang="ruby"><span style="color:#b48ead;">def </span><span style="color:#8fa1b3;">color</span><span>(</span><span style="color:#bf616a;">fg</span><span>, </span><span style="color:#bf616a;">bg</span><span>=</span><span style="color:#a3be8c;">:default</span><span>)
</span><span> fg_codes = {
</span><span> </span><span style="color:#a3be8c;">:black </span><span>=> </span><span style="color:#d08770;">30</span><span>,
</span><span> </span><span style="color:#a3be8c;">:red </span><span>=> </span><span style="color:#d08770;">31</span><span>,
</span><span> </span><span style="color:#a3be8c;">:green </span><span>=> </span><span style="color:#d08770;">32</span><span>,
</span><span> </span><span style="color:#a3be8c;">:yellow </span><span>=> </span><span style="color:#d08770;">33</span><span>,
</span><span> </span><span style="color:#a3be8c;">:blue </span><span>=> </span><span style="color:#d08770;">34</span><span>,
</span><span> </span><span style="color:#a3be8c;">:magenta </span><span>=> </span><span style="color:#d08770;">35</span><span>,
</span><span> </span><span style="color:#a3be8c;">:cyan </span><span>=> </span><span style="color:#d08770;">36</span><span>,
</span><span> </span><span style="color:#a3be8c;">:white </span><span>=> </span><span style="color:#d08770;">37</span><span>,
</span><span> </span><span style="color:#a3be8c;">:default </span><span>=> </span><span style="color:#d08770;">39</span><span>,
</span><span> }
</span><span> bg_codes = {
</span><span> </span><span style="color:#a3be8c;">:black </span><span>=> </span><span style="color:#d08770;">40</span><span>,
</span><span> </span><span style="color:#a3be8c;">:red </span><span>=> </span><span style="color:#d08770;">41</span><span>,
</span><span> </span><span style="color:#a3be8c;">:green </span><span>=> </span><span style="color:#d08770;">42</span><span>,
</span><span> </span><span style="color:#a3be8c;">:yellow </span><span>=> </span><span style="color:#d08770;">43</span><span>,
</span><span> </span><span style="color:#a3be8c;">:blue </span><span>=> </span><span style="color:#d08770;">44</span><span>,
</span><span> </span><span style="color:#a3be8c;">:magenta </span><span>=> </span><span style="color:#d08770;">45</span><span>,
</span><span> </span><span style="color:#a3be8c;">:cyan </span><span>=> </span><span style="color:#d08770;">46</span><span>,
</span><span> </span><span style="color:#a3be8c;">:white </span><span>=> </span><span style="color:#d08770;">47</span><span>,
</span><span> </span><span style="color:#a3be8c;">:default </span><span>=> </span><span style="color:#d08770;">49</span><span>,
</span><span> }
</span><span> fg_code = fg_codes.fetch(fg)
</span><span> bg_code = bg_codes.fetch(bg)
</span><span> escape "#{fg_code}</span><span style="color:#a3be8c;">;</span><span>#{bg_code}</span><span style="color:#a3be8c;">m</span><span>"
</span><span> </span><span style="color:#b48ead;">end
</span></code></pre>
<p>The foreground and the background color codes are accessed by passing in the names of the color as a symbol via <code>color(:black, :white)</code>. If you tried to port this code directly into Rust, you would run into a few problems. First, there are no symbols in Rust. You might think to work around this by using strings instead of symbols. Then you start looking at the <a href="http://doc.rust-lang.org/std/collections/struct.HashMap.html">documentation for a HashMap</a> and realize maps are quite a bit more involved than in Ruby. After fighting with the syntax for a while, you may come to the conclusion that Rust is too difficult of a language for you to figure out. There is a better way.</p>
<p>Rust has a really powerful feature called <a href="http://doc.rust-lang.org/book/compound-data-types.html#enums">enums</a>. Here is that same code using an enum:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">enum </span><span>ANSIColor {
</span><span> black,
</span><span> red,
</span><span> green,
</span><span> yellow,
</span><span> blue,
</span><span> magenta,
</span><span> cyan,
</span><span> white,
</span><span> default
</span><span>}
</span><span>
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">color</span><span>(</span><span style="color:#bf616a;">fg</span><span>: ANSIColor, </span><span style="color:#bf616a;">bg</span><span>: ANSIColor) -> String {
</span><span> </span><span style="color:#b48ead;">use </span><span>ANSIColor::*;
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> fg_code = </span><span style="color:#b48ead;">match</span><span> fg {
</span><span> black => </span><span style="color:#d08770;">30</span><span>,
</span><span> red => </span><span style="color:#d08770;">31</span><span>,
</span><span> green => </span><span style="color:#d08770;">32</span><span>,
</span><span> yellow => </span><span style="color:#d08770;">33</span><span>,
</span><span> blue => </span><span style="color:#d08770;">34</span><span>,
</span><span> magenta => </span><span style="color:#d08770;">35</span><span>,
</span><span> cyan => </span><span style="color:#d08770;">36</span><span>,
</span><span> white => </span><span style="color:#d08770;">37</span><span>,
</span><span> default => </span><span style="color:#d08770;">39</span><span>,
</span><span> };
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> bg_code = </span><span style="color:#b48ead;">match</span><span> bg {
</span><span> black => </span><span style="color:#d08770;">40</span><span>,
</span><span> red => </span><span style="color:#d08770;">41</span><span>,
</span><span> green => </span><span style="color:#d08770;">42</span><span>,
</span><span> yellow => </span><span style="color:#d08770;">43</span><span>,
</span><span> blue => </span><span style="color:#d08770;">44</span><span>,
</span><span> magenta => </span><span style="color:#d08770;">45</span><span>,
</span><span> cyan => </span><span style="color:#d08770;">46</span><span>,
</span><span> white => </span><span style="color:#d08770;">47</span><span>,
</span><span> default => </span><span style="color:#d08770;">49</span><span>,
</span><span> };
</span><span>
</span><span> </span><span style="color:#b48ead;">let</span><span> seq = format!("</span><span style="color:#d08770;">{}</span><span style="color:#a3be8c;">;</span><span style="color:#d08770;">{}</span><span style="color:#a3be8c;">m</span><span>", fg_code, bg_code);
</span><span> </span><span style="color:#96b5b4;">escape</span><span>(seq.</span><span style="color:#96b5b4;">as_ref</span><span>())
</span><span>}
</span></code></pre>
<p>You can call the color method via <code>color(ANSIColor::black, ANSIColor::white);</code>. There is a complete, working example on <a href="http://is.gd/OIRlyR">playpen</a>.</p>
<p>I think the use of enums in that code is really expressive. Even more so than Ruby: <code>color(:black, :white)</code>. The Rust enum provides the context of what <code>black</code> or <code>white</code> mean to the <code>color</code> function. This also has the benefit of being type-checked by the compiler. If you or anyone else tries to specify a color like <code>ANSIColor::pink</code>, the compiler will generate an error for you. This removes the pain of checking for valid colors at runtime, handling those errors at runtime and writing tests around those use-cases. This compile-time checking is what makes Rust such a powerful language.</p>
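<p>For readers who want something they can paste and run, here is a self-contained sketch of the same idea. It makes a few assumptions that are not in the code above: the variants use the conventional CamelCase naming, the code tables live in methods on the enum, and <code>escape</code> is a hypothetical helper that simply prepends the ANSI <code>ESC [</code> prefix.</p>

```rust
#[derive(Clone, Copy)]
enum AnsiColor {
    Black,
    Red,
    Green,
    Yellow,
    Blue,
    Magenta,
    Cyan,
    White,
    Default,
}

impl AnsiColor {
    // Foreground codes: 30-37, with 39 for the terminal default.
    fn fg_code(self) -> u8 {
        match self {
            AnsiColor::Black => 30,
            AnsiColor::Red => 31,
            AnsiColor::Green => 32,
            AnsiColor::Yellow => 33,
            AnsiColor::Blue => 34,
            AnsiColor::Magenta => 35,
            AnsiColor::Cyan => 36,
            AnsiColor::White => 37,
            AnsiColor::Default => 39,
        }
    }

    // Background codes are the foreground codes shifted by 10.
    fn bg_code(self) -> u8 {
        self.fg_code() + 10
    }
}

// Hypothetical helper: wraps a sequence in the ANSI "ESC [" prefix.
fn escape(seq: &str) -> String {
    format!("\x1b[{}", seq)
}

fn color(fg: AnsiColor, bg: AnsiColor) -> String {
    escape(&format!("{};{}m", fg.fg_code(), bg.bg_code()))
}

fn main() {
    // Black text on a white background, then back to the defaults.
    println!(
        "{}hello{}",
        color(AnsiColor::Black, AnsiColor::White),
        color(AnsiColor::Default, AnsiColor::Default)
    );
}
```

<p>Deriving the background code from the foreground code means there is only one table to keep correct, and the <code>match</code> still has to cover every variant or the program will not compile.</p>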
<p>There are plenty of cases for using dicts/hashes/maps in Rust. However, if you find yourself writing a function that accepts a finite set of options, then I suggest trying to use a Rust enum before resorting to a HashMap.</p>
http://activitystrea.ms/schema/1.0/postGetter Functions In Rust2015-01-14T00:00:00+00:002015-01-14T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/01/14/getters-functions-in-rust.html/<p>As soon as I started writing implementations for structs in Rust I started fighting with the compiler. Writing what seemed like a simple getter function caused me a lot of frustration. The <code>self</code> parameter can really throw me off in Rust. I reflexively treat it like <code>this</code> in C++, which has no concept of <code>&</code> or <code>&mut</code>. I do this because I think of <code>impl Person</code> as defining methods on a <em>class</em> as I would do in C++. This can be really misleading.</p>
<p>Consider this Ruby code that we want to port to Rust: </p>
<pre data-lang="ruby" style="background-color:#2b303b;color:#c0c5ce;" class="language-ruby "><code class="language-ruby" data-lang="ruby"><span style="color:#b48ead;">class </span><span style="color:#ebcb8b;">Person
</span><span style="color:#eff1f5;"> </span><span style="color:#8fa1b3;">attr_reader </span><span style="color:#a3be8c;">:name
</span><span>
</span><span> </span><span style="color:#b48ead;">def </span><span style="color:#8fa1b3;">initialize</span><span>(</span><span style="color:#bf616a;">name</span><span>)
</span><span> @</span><span style="color:#bf616a;">name </span><span>= </span><span style="color:#96b5b4;">name
</span><span> </span><span style="color:#b48ead;">end
</span><span style="color:#b48ead;">end
</span></code></pre>
<p>In Rust this would look something like:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person {
</span><span> </span><span style="color:#bf616a;">name</span><span>: String
</span><span>}
</span><span>
</span><span style="color:#b48ead;">impl </span><span>Person {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">new</span><span>(</span><span style="color:#bf616a;">name</span><span>: String) -> Person {
</span><span> Person { name: name }
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">get_name</span><span>(</span><span style="color:#bf616a;">self</span><span>) -> String {
</span><span> </span><span style="color:#bf616a;">self</span><span>.name
</span><span> }
</span><span>}
</span><span>
</span><span>#[</span><span style="color:#bf616a;">test</span><span>]
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">test_get_person</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> p = Person::new("</span><span style="color:#a3be8c;">Herman</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>());
</span><span> assert!(p.</span><span style="color:#96b5b4;">get_name</span><span>().</span><span style="color:#96b5b4;">as_slice</span><span>() == "</span><span style="color:#a3be8c;">Herman</span><span>");
</span><span>}
</span></code></pre>
<p>I run <code>rustc --test person.rs</code>, everything compiles and things are looking good. Even the test passes. What happens if I want to use <code>p</code> again though? If I modify my test to call <code>.get_name()</code> again I receive a cryptic error:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>#[</span><span style="color:#bf616a;">test</span><span>]
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">test_get_person</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> p = Person::new("</span><span style="color:#a3be8c;">Herman</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>());
</span><span> assert!(p.</span><span style="color:#96b5b4;">get_name</span><span>().</span><span style="color:#96b5b4;">as_slice</span><span>() == "</span><span style="color:#a3be8c;">Herman</span><span>");
</span><span> assert!(p.</span><span style="color:#96b5b4;">get_name</span><span>().</span><span style="color:#96b5b4;">as_slice</span><span>() == "</span><span style="color:#a3be8c;">Herman</span><span>");
</span><span>}
</span></code></pre>
<pre data-lang="bash" style="background-color:#2b303b;color:#c0c5ce;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#bf616a;">$</span><span> rustc person.rs
</span><span>
</span><span style="color:#bf616a;">person.rs:21:13:</span><span> 21:14 error: use of moved value: `</span><span style="color:#bf616a;">p</span><span>`
</span><span style="color:#bf616a;">person.rs:21</span><span> assert!(p.get_name().as_slice() =</span><span style="color:#a3be8c;">= </span><span>"</span><span style="color:#a3be8c;">Herman</span><span>");
</span><span> </span><span style="color:#bf616a;">^
</span><span><std </span><span style="color:#bf616a;">macros</span><span>>:1:1: 5:46 note: in expansion of assert!
</span><span style="color:#bf616a;">person.rs:21:5:</span><span> 21:50 note: expansion site
</span><span style="color:#bf616a;">person.rs:20:13:</span><span> 20:14 note: `</span><span style="color:#bf616a;">p</span><span>` moved here because it has type `</span><span style="color:#bf616a;">Person</span><span>`, which is non-copyable
</span><span style="color:#bf616a;">person.rs:20</span><span> assert!(p.get_name().as_slice() =</span><span style="color:#a3be8c;">= </span><span>"</span><span style="color:#a3be8c;">Herman</span><span>");
</span><span> </span><span style="color:#bf616a;">^
</span><span><std </span><span style="color:#bf616a;">macros</span><span>>:1:1: 5:46 note: in expansion of assert!
</span><span style="color:#bf616a;">person.rs:20:5:</span><span> 20:50 note: expansion site
</span><span style="color:#bf616a;">error:</span><span> aborting due to previous error
</span></code></pre>
<p>I read about <a href="http://doc.rust-lang.org/book/ownership.html">ownership in the Rust Book</a> and recall some of what <em>moved</em> means, but it is not clear where to go from here. What many people new to Rust do is resort to using <code>.clone()</code>, but even that will not satisfy the compiler. Thinking back to C++, using a reference makes sense! Let's change the first parameter of <code>.get_name()</code> from <code>self</code> to <code>&self</code>:</p>
<pre data-lang="bash" style="background-color:#2b303b;color:#c0c5ce;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#bf616a;">$</span><span> rustc person.rs
</span><span>
</span><span style="color:#bf616a;">person.rs:13:7:</span><span> 13:11 error: cannot move out of borrowed content
</span><span style="color:#bf616a;">person.rs:13</span><span> self.name
</span></code></pre>
<p>Whenever I see the word <em>borrowed</em> I know the compiler is referring to something that is being passed by reference. In this case <code>&self</code> is being passed by reference. The compiler is trying to tell me that it cannot move ownership of <code>name</code> out of my borrowed <code>&self</code>. I do not want to give up ownership of <code>name</code> though. I simply want my test to have access to the value for a little while. So, the next step is to return a reference to a String, via <code>&String</code>, so ownership doesn't change. Compiling that shows me:</p>
<pre data-lang="bash" style="background-color:#2b303b;color:#c0c5ce;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#bf616a;">$</span><span> rustc person.rs
</span><span>
</span><span style="color:#bf616a;">person.rs:13:7:</span><span> 13:16 error: mismatched types: expected `&</span><span style="color:#bf616a;">collections::string::String</span><span>`, found `</span><span style="color:#bf616a;">collections::string::String</span><span>` (expected &</span><span style="color:#bf616a;">-ptr,</span><span> found struct collections::string::String)
</span><span style="color:#bf616a;">person.rs:13</span><span> self.name
</span><span> </span><span style="color:#bf616a;">^~~~~~~~~
</span><span style="color:#bf616a;">error:</span><span> aborting due to previous error
</span></code></pre>
<p>This sort of error is very familiar in Rust. Turning <code>self.name</code> into a reference via <code>&self.name</code> makes everything compile and leaves us with:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">struct </span><span>Person {
</span><span> </span><span style="color:#bf616a;">name</span><span>: String
</span><span>}
</span><span>
</span><span style="color:#b48ead;">impl </span><span>Person {
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">new</span><span>(</span><span style="color:#bf616a;">name</span><span>: String) -> Person {
</span><span> Person { name: name }
</span><span> }
</span><span>
</span><span> </span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">get_name</span><span>(&</span><span style="color:#bf616a;">self</span><span>) -> &String {
</span><span> &</span><span style="color:#bf616a;">self</span><span>.name
</span><span> }
</span><span>}
</span><span>
</span><span>#[</span><span style="color:#bf616a;">test</span><span>]
</span><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">test_get_person</span><span>() {
</span><span> </span><span style="color:#b48ead;">let</span><span> p = Person::new("</span><span style="color:#a3be8c;">Herman</span><span>".</span><span style="color:#96b5b4;">to_string</span><span>());
</span><span> assert!(p.</span><span style="color:#96b5b4;">get_name</span><span>().</span><span style="color:#96b5b4;">as_slice</span><span>() == "</span><span style="color:#a3be8c;">Herman</span><span>");
</span><span> assert!(p.</span><span style="color:#96b5b4;">get_name</span><span>().</span><span style="color:#96b5b4;">as_slice</span><span>() == "</span><span style="color:#a3be8c;">Herman</span><span>");
</span><span>}
</span></code></pre>
<h2 id="comparison-to-c"><a class="zola-anchor" href="#comparison-to-c" aria-label="Anchor link for: comparison-to-c">Comparison to C++</a></h2>
<p>What made things really click for me is to think about <a href="https://en.wikipedia.org/wiki/Reference_%28C%2B%2B%29#Uses_of_references">how references work in C++</a>. I also see the <a href="http://doc.rust-lang.org/book/method-syntax.html">Rust Book</a> now includes the language <em>We should default to using <code>&self</code>, as it's the most common.</em> within the context of Rust's methods.</p>
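<p>The listings above were written against a pre-1.0 compiler, and <code>as_slice()</code> is not available on <code>String</code> in current Rust. As a rough sketch of how the same getter might look today, the version below borrows <code>self</code> and returns a <code>&str</code>. The method names <code>name</code> and <code>into_name</code> are my own choices for illustration, not something taken from the original code.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">struct Person {
    name: String,
}

impl Person {
    fn new(name: &str) -> Person {
        Person { name: name.to_string() }
    }

    // Borrows `self`, so the caller keeps ownership of the `Person`
    // and can call this as many times as it likes.
    fn name(&self) -> &str {
        &self.name
    }

    // Takes `self` by value: the `Person` is moved into the call and
    // cannot be used afterwards. This is what the first attempt did by accident.
    fn into_name(self) -> String {
        self.name
    }
}
</code></pre>
<p>With this definition, <code>p.name()</code> can be called twice in a row, while a second call to <code>p.into_name()</code> would produce a "use of moved value" error like the one earlier in the post.</p>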
http://activitystrea.ms/schema/1.0/postTerminal Window Size With Rust FFI2015-01-12T00:00:00+00:002015-01-12T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/01/12/terminal-window-size-with-rust-ffi.html/<p>I was writing some code in Rust and wanted to get the size of my terminal. This is currently <a href="https://github.com/rust-lang/rust/blob/470118f3e915cdc8f936aca0640b28a7a3d8dc6c/src/libstd/sys/unix/tty.rs#L44-46">not implemented</a> in Rust though. I decided to read up on <a href="http://static.rust-lang.org/doc/master/book/ffi.html">The Foreign Function Interface Guide</a> to figure out how to do it myself. The Foreign Function Interface (FFI) is how Rust code interfaces with native C code. I also found a great <a href="http://stackoverflow.com/a/1022961/775246">Stack Overflow post</a> that showed me how to write native C to get the terminal size. Based on my research, I needed to do three things in order to get my terminal size:</p>
<ul>
<li>Create a <code>winsize</code> struct in Rust.</li>
<li>Use or externalize the <code>ioctl</code> C function.</li>
<li>Use or externalize the <code>STDOUT_FILENO</code> and <code>TIOCGWINSZ</code> constants.</li>
</ul>
<h2 id="winsize-struct"><a class="zola-anchor" href="#winsize-struct" aria-label="Anchor link for: winsize-struct">Winsize Struct</a></h2>
<p>Creating the <code>winsize</code> struct in Rust is pretty straightforward as Rust has structs too. I first needed to find the definition of <code>winsize</code> in C, so I did some googling and found the <a href="http://unix.superglobalmegacorp.com/Net2/newsrc/sys/ioctl.h.html">sys/ioctl.h source</a>. When defining the struct, we must tell Rust to represent the struct as a C struct using <code>#[repr(C)]</code>. If you read the FFI Guide, then you may be wondering about <code>#[repr(C, packed)]</code>. I talk about packing in more detail at the end of the <a href="https://hermanradtke.com/2015/01/12/terminal-window-size-with-rust-ffi.html/#to-pack-or-not">post</a>. The struct members within <code>winsize</code> are all <code>unsigned short</code>. The C <code>unsigned short</code> is represented in Rust as <code>c_ushort</code> in the <code>libc</code> Rust module. We now have:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">use </span><span>libc::</span><span style="color:#b48ead;">c_ushort</span><span>;
</span><span>
</span><span>#[</span><span style="color:#bf616a;">repr</span><span>(C)]
</span><span style="color:#b48ead;">struct </span><span>winsize {
</span><span> </span><span style="color:#bf616a;">ws_row</span><span>: c_ushort, </span><span style="color:#65737e;">/* rows, in characters */
</span><span> </span><span style="color:#bf616a;">ws_col</span><span>: c_ushort, </span><span style="color:#65737e;">/* columns, in characters */
</span><span> </span><span style="color:#bf616a;">ws_xpixel</span><span>: c_ushort, </span><span style="color:#65737e;">/* horizontal size, pixels */
</span><span> </span><span style="color:#bf616a;">ws_ypixel</span><span>: c_ushort </span><span style="color:#65737e;">/* vertical size, pixels */
</span><span>}
</span></code></pre>
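<p>One cheap sanity check on a hand-copied struct is to compare its size with what the C side expects. The sketch below assumes a platform where <code>unsigned short</code> is 16 bits, uses <code>std::os::raw::c_ushort</code> so it does not need the <code>libc</code> crate, and renames the struct to <code>Winsize</code> to follow Rust naming conventions. The <code>winsize_bytes</code> helper is hypothetical.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">use std::mem;
use std::os::raw::c_ushort;

// Same field order as the C definition; `Default` gives an all-zero value.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy)]
struct Winsize {
    ws_row: c_ushort,    // rows, in characters
    ws_col: c_ushort,    // columns, in characters
    ws_xpixel: c_ushort, // horizontal size, pixels
    ws_ypixel: c_ushort, // vertical size, pixels
}

// Four 2-byte fields with no padding between them should come to 8 bytes,
// matching `sizeof(struct winsize)` on the C side.
fn winsize_bytes() -> usize {
    mem::size_of::<Winsize>()
}
</code></pre>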
<h2 id="ioctl"><a class="zola-anchor" href="#ioctl" aria-label="Anchor link for: ioctl">ioctl</a></h2>
<p>Now I need to figure out what to do about the <code>ioctl</code> function. Checking out the Rust docs leads me to the <a href="http://doc.rust-lang.org/libc/funcs/bsd44/fn.ioctl.html">ioctl function signature</a> but I notice that this signature does not look like a variadic function (no varargs). I guess I have to externalize it in my code as a variadic function. I decided to check the Rust source to see if I could find an example of a variadic function and I stumbled on the <a href="https://github.com/rust-lang/rust/blob/5b3cd3900ceda838f5798c30ab96ceb41f962534/src/libstd/sys/unix/c.rs#L78">definition of ioctl</a>. This definition is variadic, so I guess rustdoc does not show this. Strange.</p>
<p>I have read that <code>ws_xpixel</code> and <code>ws_ypixel</code> are not used. I also have no use for them. I still opted to include them in my struct definition as I have no idea what <code>ioctl</code> is doing to that struct.</p>
<p>I have used this word <em>externalized</em> a few times already, so maybe I should now define it. To <em>externalize</em> something is to make that something's C representation accessible to Rust code. You normally do this with function signatures, constants and global variables. Note that we did not externalize <code>winsize</code>, but instead copied the definition from C to Rust. We cannot externalize <code>winsize</code> as Rust needs to directly manage the definition and memory related to that struct.</p>
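<p>As a small illustration of what externalizing a function looks like, here is a sketch that declares the C library's <code>strlen</code> and calls it from Rust. I picked <code>strlen</code> only because it is simple and not variadic. The wrapper <code>c_string_len</code> is a hypothetical helper, and the example assumes a platform where Rust's standard library already links against the C library.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">use std::ffi::CStr;
use std::os::raw::c_char;

// Only the signature is declared here; the symbol itself lives in the C library.
// (In the 2024 edition this block is written `unsafe extern "C"`.)
extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

// A safe wrapper: `CStr` guarantees a valid, NUL-terminated string,
// which is exactly what `strlen` requires.
fn c_string_len(s: &CStr) -> usize {
    unsafe { strlen(s.as_ptr()) }
}
</code></pre>
<p>The compiler cannot check that the declared signature matches the real C function, which is why every call has to sit inside an <code>unsafe</code> block.</p>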
<h2 id="the-constants"><a class="zola-anchor" href="#the-constants" aria-label="Anchor link for: the-constants">The Constants</a></h2>
<p>Finally, I need to deal with my constants. I was pretty sure <code>STDOUT_FILENO</code> would already be in Rust. Sure enough, <code>libc::STDOUT_FILENO</code> exists. I was not so lucky with <code>TIOCGWINSZ</code>. The <code>TIOCGWINSZ</code> constant acts as a command to <code>ioctl</code>. If you read the source of <code>sys/ioctl.h</code>, you will notice the value of the commands is based on some rules that encode information to <code>ioctl</code>. There is a fair amount of bit twiddling going on to generate these values. Even if we do the bitwise math by hand, we should still check our work. To do that, I wrote a simple C program that would tell us the proper hex value of <code>TIOCGWINSZ</code>:</p>
<pre data-lang="c" style="background-color:#2b303b;color:#c0c5ce;" class="language-c "><code class="language-c" data-lang="c"><span style="color:#b48ead;">#include </span><span><</span><span style="color:#a3be8c;">sys/ioctl.h</span><span>>
</span><span style="color:#b48ead;">#include </span><span><</span><span style="color:#a3be8c;">stdio.h</span><span>>
</span><span style="color:#b48ead;">#include </span><span><</span><span style="color:#a3be8c;">unistd.h</span><span>>
</span><span>
</span><span style="color:#b48ead;">int </span><span style="color:#8fa1b3;">main </span><span>(</span><span style="color:#b48ead;">int </span><span style="color:#bf616a;">argc</span><span>, </span><span style="color:#b48ead;">char </span><span>**</span><span style="color:#bf616a;">argv</span><span>)
</span><span>{
</span><span> </span><span style="color:#96b5b4;">printf</span><span>("</span><span style="color:#a3be8c;">0x</span><span style="color:#d08770;">%x</span><span>", TIOCGWINSZ);
</span><span> </span><span style="color:#b48ead;">return </span><span style="color:#d08770;">0</span><span>;
</span><span>}
</span></code></pre>
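<p>Using the value it printed, I can create the same constant in Rust. Keep in mind that the value is platform-specific, so it needs to be checked on each target:</p>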
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">const </span><span style="color:#d08770;">TIOCGWINSZ</span><span>: </span><span style="color:#b48ead;">c_ulong </span><span>= </span><span style="color:#d08770;">0x40087468</span><span>;
</span></code></pre>
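<p>For the curious, the bit twiddling can also be reproduced in Rust. The sketch below assumes the BSD-style encoding, where a "read" command is built as <code>IOC_OUT | ((len & IOCPARM_MASK) << 16) | (group << 8) | num</code> and <code>TIOCGWINSZ</code> is defined as <code>_IOR('t', 104, struct winsize)</code>. Linux encodes its commands differently, so treat this as an illustration for BSD-derived headers rather than something portable. The function name <code>ior</code> is my own.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">// Constants as found in BSD-style ioctl headers (assumed here, check your own headers).
const IOCPARM_MASK: u32 = 0x1fff; // the parameter length is stored in 13 bits
const IOC_OUT: u32 = 0x4000_0000; // the kernel copies data out to the caller

// Rough equivalent of the C `_IOR(group, num, type)` macro,
// taking the size of the type directly instead of the type itself.
const fn ior(group: u8, num: u8, len: u32) -> u32 {
    IOC_OUT | ((len & IOCPARM_MASK) << 16) | ((group as u32) << 8) | (num as u32)
}

// `struct winsize` holds four `unsigned short` fields, so its size is 8 bytes.
const TIOCGWINSZ: u32 = ior(b't', 104, 8);
</code></pre>
<p>Working through the pieces: <code>0x40000000 | 0x00080000 | 0x7400 | 0x68</code> gives <code>0x40087468</code>, which agrees with the constant above.</p>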
<h2 id="putting-it-all-together"><a class="zola-anchor" href="#putting-it-all-together" aria-label="Anchor link for: putting-it-all-together">Putting It All Together</a></h2>
<p>My function for <code>get_winsize</code> now looks like this:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">get_winsize</span><span>() -> IoResult<(</span><span style="color:#b48ead;">isize</span><span>, </span><span style="color:#b48ead;">isize</span><span>)> {
</span><span> </span><span style="color:#b48ead;">let mut</span><span> w = winsize { ws_row: </span><span style="color:#d08770;">0</span><span>, ws_col: </span><span style="color:#d08770;">0</span><span>, ws_xpixel: </span><span style="color:#d08770;">0</span><span>, ws_ypixel: </span><span style="color:#d08770;">0 </span><span>};
</span><span> </span><span style="color:#b48ead;">let</span><span> r = </span><span style="color:#b48ead;">unsafe </span><span>{ </span><span style="color:#96b5b4;">ioctl</span><span>(</span><span style="color:#d08770;">STDOUT_FILENO</span><span>, </span><span style="color:#d08770;">TIOCGWINSZ</span><span>, &</span><span style="color:#b48ead;">mut</span><span> w) };
</span><span>
</span><span> </span><span style="color:#b48ead;">match</span><span> r {
</span><span> </span><span style="color:#d08770;">0 </span><span>=> Ok((w.ws_col as </span><span style="color:#b48ead;">isize</span><span>, w.ws_row as </span><span style="color:#b48ead;">isize</span><span>)),
</span><span> _ => {
</span><span> </span><span style="color:#b48ead;">return </span><span>Err(</span><span style="color:#96b5b4;">standard_error</span><span>(ResourceUnavailable))
</span><span> }
</span><span> }
</span><span>}
</span></code></pre>
<p>I initialize my variable containing a <code>winsize</code> struct with values of zero, just like I would <code>memset(w, 0, sizeof(winsize))</code> in C. In order to use the externalized <code>ioctl</code> function, we have to wrap the code in <code>unsafe {}</code> blocks. This informs Rust this code is not to be checked by the compiler for safety. The <code>ioctl</code> function follows the C convention of returning a <code>0</code> for success and a <code>-1</code> for an error. If an error occurs, I decided to throw an existing <code>IoResult</code> error already in Rust. I need to spend a little more time to externalize the <code>errno</code> global variable in C so I can get the exact error. If the function is successful, I return the width and height as a tuple.</p>
<p>Here is a <a href="https://gist.github.com/hjr3/0cbe1ac2f10e6e3df96a">gist</a> of the complete program, including a simple test. This puts all the pieces discussed above together and will properly calculate the terminal window size when executed.</p>
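<p>For readers on a current compiler, here is a rough sketch of how the same function might be written today. It is not the code from the gist. It assumes Linux or macOS, hard-codes request numbers that I believe are right for common targets (<code>0x5413</code> on Linux x86 and ARM, <code>0x40087468</code> on macOS) but that you should verify with the C program above, and declares <code>ioctl</code> by hand instead of pulling in the <code>libc</code> crate. It also uses <code>std::io::Error::last_os_error()</code> so the caller sees the real <code>errno</code>.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">use std::io;
use std::os::raw::{c_int, c_ulong, c_ushort};

#[repr(C)]
#[derive(Default)]
struct Winsize {
    ws_row: c_ushort,
    ws_col: c_ushort,
    ws_xpixel: c_ushort,
    ws_ypixel: c_ushort,
}

// Request numbers are platform-specific; these values are assumptions to verify locally.
#[cfg(target_os = "linux")]
const TIOCGWINSZ: c_ulong = 0x5413;
#[cfg(target_os = "macos")]
const TIOCGWINSZ: c_ulong = 0x40087468;

const STDOUT_FILENO: c_int = 1;

// Hand-written declaration of the variadic C function.
// (In the 2024 edition this block is written `unsafe extern "C"`.)
extern "C" {
    fn ioctl(fd: c_int, request: c_ulong, ...) -> c_int;
}

/// Returns (columns, rows) of the terminal attached to stdout.
fn get_winsize() -> io::Result<(u16, u16)> {
    // `mut` because the kernel writes into the struct through the pointer.
    let mut w = Winsize::default();
    let r = unsafe { ioctl(STDOUT_FILENO, TIOCGWINSZ, &mut w as *mut Winsize) };
    if r == 0 {
        Ok((w.ws_col, w.ws_row))
    } else {
        // Wraps the current `errno`, for example ENOTTY when stdout is a pipe.
        Err(io::Error::last_os_error())
    }
}
</code></pre>
<p>When stdout is not a terminal, for example when the output is piped into a file, the call fails and the function returns the operating system error instead of a size.</p>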
<h2 id="to-pack-or-not"><a class="zola-anchor" href="#to-pack-or-not" aria-label="Anchor link for: to-pack-or-not">To Pack Or Not</a></h2>
<p>If you see a struct defined with <code>__attribute__((__packed__))</code> then you need to use <code>#[repr(C, packed)]</code>. Example:</p>
<pre data-lang="c" style="background-color:#2b303b;color:#c0c5ce;" class="language-c "><code class="language-c" data-lang="c"><span>struct __attribute__((__packed__)) foo {
</span><span>    char first;
</span><span>    int second;
</span><span>};
</span></code></pre>
<p>A packed C struct, usually only found in kernel development, is not <em>padded</em>. If you are not familiar with <em>padding</em> in C, then you may not understand what <code>#[repr(C, packed)]</code> does. When defining a struct in C, the struct members are aligned to <em>word boundaries</em>. A <em>word</em> is the natural address boundary for a given architecture. For example, on a 32-bit machine a word is 4 bytes. If a struct member does not align to a word boundary, the compiler will insert padding after the variable. A struct like</p>
<pre data-lang="c" style="background-color:#2b303b;color:#c0c5ce;" class="language-c "><code class="language-c" data-lang="c"><span>struct foo {
</span><span>    char first; // 1 byte
</span><span>    int second; // 4 bytes
</span><span>};
</span></code></pre>
<p>is not 5 bytes in size, but 8 bytes due to padding. Here is how the same struct looks after the compiler has added padding:</p>
<pre data-lang="c" style="background-color:#2b303b;color:#c0c5ce;" class="language-c "><code class="language-c" data-lang="c"><span style="color:#b48ead;">struct </span><span>foo {
</span><span> </span><span style="color:#b48ead;">char</span><span> first; </span><span style="color:#65737e;">// 1 byte
</span><span> </span><span style="color:#b48ead;">char</span><span> padding[</span><span style="color:#d08770;">3</span><span>]; </span><span style="color:#65737e;">// 3 bytes
</span><span> </span><span style="color:#b48ead;">int</span><span> second; </span><span style="color:#65737e;">// 4 bytes
</span><span>};
</span></code></pre>
<p>I found a <a href="http://stackoverflow.com/a/4306269/775246">Stack Overflow post</a> that explains it in even greater detail. Also, check out the <a href="http://en.wikipedia.org/wiki/Data_structure_alignment">Data Structure Alignment</a> article on Wikipedia.</p>
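<p>The same effect can be observed from the Rust side. This sketch defines the struct twice, once with <code>#[repr(C)]</code> and once with <code>#[repr(C, packed)]</code>, and compares their sizes. The names <code>Padded</code>, <code>Packed</code> and <code>sizes</code> are made up for the example, and the numbers in the comments assume a platform where <code>int</code> is 4 bytes with 4-byte alignment.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">use std::mem::size_of;
use std::os::raw::{c_char, c_int};

// Laid out like the plain C struct: 3 bytes of padding follow `first`.
#[repr(C)]
struct Padded {
    first: c_char, // 1 byte
    second: c_int, // 4 bytes
}

// Laid out like the `__attribute__((__packed__))` struct: no padding at all.
#[repr(C, packed)]
struct Packed {
    first: c_char, // 1 byte
    second: c_int, // 4 bytes, possibly at an unaligned address
}

// Returns (size with padding, size when packed).
fn sizes() -> (usize, usize) {
    (size_of::<Padded>(), size_of::<Packed>())
}
</code></pre>
<p>On the assumed platform <code>sizes()</code> yields 8 for the padded struct and 5 for the packed one. Getting this attribute wrong means Rust and C disagree about where <code>second</code> lives, so it is worth matching the C declaration exactly.</p>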
http://activitystrea.ms/schema/1.0/postUsing the Nickel.rs Router Macro2015-01-05T00:00:00+00:002015-01-05T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2015/01/05/hypermedia-api-using-rustlang-nickel.html/<p>The <a href="http://nickel.rs/">nickel.rs Web Application Framework for Rust</a> is inspired by the popular <a href="http://expressjs.com/">node.js express</a> framework. The stated goal of the nickel.rs framework is to show people that it can be easy to write web servers using a systems language like Rust. I have been using the framework to create hypermedia examples using my hal-rs library.</p>
<p>One of the downsides to a systems language like Rust is the verbosity of the syntax. Someone used to writing in Python or Ruby may be in for quite a shock. I started really feeling this when writing route handlers used by nickel. Here is the simple example from the docs:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#b48ead;">fn </span><span style="color:#8fa1b3;">a_handler </span><span>(</span><span style="color:#bf616a;">_request</span><span>: &Request, </span><span style="color:#bf616a;">response</span><span>: &</span><span style="color:#b48ead;">mut</span><span> Response) {
</span><span> response.</span><span style="color:#96b5b4;">send</span><span>("</span><span style="color:#a3be8c;">hello world</span><span>");
</span><span>}
</span><span>
</span><span>server.</span><span style="color:#96b5b4;">get</span><span>("</span><span style="color:#a3be8c;">/</span><span>", a_handler);
</span></code></pre>
<p>Notice the <code>_request</code> variable has a leading underscore. This tells the compiler not to throw a warning if this variable is unused. If you do decide to use <code>_request</code> later on, then you need to remember to change the variable name to <code>request</code>. You also need to make up a name for the function so it can be referenced in the call to <code>server.get</code>. This sort of boilerplate stuff is not what I want to spend time worrying about. What I really want is a nice looking DSL to describe my routes.</p>
<p>After some perusing of the provided <a href="https://github.com/nickel-org/nickel.rs/tree/master/examples">examples</a> in nickel.rs, I discovered the <code>router!</code> macro. We can use the <code>router!</code> macro to get a DSL-like syntax for routing. Here is the same example above using the <code>router!</code> macro:</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust"><span>router! {
</span><span> get "</span><span style="color:#a3be8c;">/</span><span>" => |</span><span style="color:#bf616a;">request</span><span>, </span><span style="color:#bf616a;">response</span><span>| {
</span><span> response.</span><span style="color:#96b5b4;">send</span><span>("</span><span style="color:#a3be8c;">hello world</span><span>");
</span><span> }
</span><span>}
</span></code></pre>
<p>When we use the <code>router!</code> macro, it expands into the same Rust code as in our first example. We don't have to think of a name for the function, worry about the type of request or response, or type out the <code>server.get</code> line. If you want to see the <code>router!</code> macro used in a real world example, check out the <a href="https://github.com/hjr3/hal-rs-demo/blob/4d0a0ab7a1f69708f0c8a5fa2d6669bed223c67f/src/main.rs#L138-168">index response</a> from my <a href="https://github.com/hjr3/hal-rs-demo/">hal-rs-demo</a> web server. The code for the <code>router!</code> macro is <a href="https://github.com/nickel-org/nickel.rs/blob/b8bb31d0efe47f105f6701f73efe0ecd4a6c83de/nickel_macros/src/macro.rs">here</a>.</p>
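<p>To give a feel for how a macro can turn a routing DSL into ordinary Rust, here is a toy sketch written with <code>macro_rules!</code>. It is not nickel's implementation and does not use nickel's types: the <code>routes!</code> macro, the <code>Handler</code> alias, and the <code>boxed</code> and <code>build_routes</code> functions are all hypothetical, and a handler here is simply a closure from a request body to a response body.</p>
<pre data-lang="rust" style="background-color:#2b303b;color:#c0c5ce;" class="language-rust "><code class="language-rust" data-lang="rust">// Hypothetical handler type: takes the request body, returns the response body.
type Handler = Box<dyn Fn(&str) -> String>;

// Boxing through a function tells the compiler what signature each closure has,
// so the closures in the DSL do not need type annotations.
fn boxed<F>(f: F) -> Handler
where
    F: Fn(&str) -> String + 'static,
{
    Box::new(f)
}

// Each `method "path" => handler` entry expands to one (method, path, handler) tuple.
macro_rules! routes {
    ($($method:ident $path:expr => $handler:expr),* $(,)?) => {
        vec![$((stringify!($method), $path, boxed($handler))),*]
    };
}

fn build_routes() -> Vec<(&'static str, &'static str, Handler)> {
    routes! {
        get "/" => |_body| "hello world".to_string(),
        post "/echo" => |body| body.to_string(),
    }
}
</code></pre>
<p>The caller writes only the method, the path and a closure. The macro supplies the boxing and the tuple plumbing, which is the same kind of boilerplate removal that <code>router!</code> provides.</p>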
<p>The jury is still out on whether or not nickel.rs, or even Rust itself, will be suitable for creating web servers that serve up API responses and HTML to clients. I like many things about Rust though, so I will continue to find out.</p>
http://activitystrea.ms/schema/1.0/postEmberconf 20142014-03-27T00:00:00+00:002014-03-27T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2014/03/27/emberconf-2014.html/<p>My notes from <a href="http://emberconf.com/">Emberconf</a>.</p>
<h2 id="opening-keynote"><a class="zola-anchor" href="#opening-keynote" aria-label="Anchor link for: opening-keynote">Opening Keynote</a></h2>
<p><em>Open source communities value contributions that are not just code.</em> --- Yehuda Katz</p>
<ul>
<li>Robert Jackson contributed a lot of work to making the 6 week release process much more automated.</li>
<li>A number of people worked to create ember-cli tooling.</li>
<li>Jo Liss created Broccoli.</li>
<li>Leah Silber organized Emberconf.</li>
</ul>
<p>I believe a successful team requires more than just someone who writes the primary code. It requires meta-contributions that make the environment and culture excellent. This is my vision for the HauteLook team and embraces concepts, such as DevOps, as first class citizens of the process.</p>
<h3 id="emphasis-on-screens-and-flows"><a class="zola-anchor" href="#emphasis-on-screens-and-flows" aria-label="Anchor link for: emphasis-on-screens-and-flows">Emphasis on screens and flows</a></h3>
<ul>
<li>Yehuda referenced a Basecamp blog post <a href="http://signalvnoise.com/posts/1926-a-shorthand-for-designing-ui-flows">A shorthand for designing UI flows</a> to emphasize the need to focus on flows now that Ember has first-class URL support.</li>
</ul>
<p>This was a call to action to start thinking differently about building web applications. The github website was used to demonstrate the complexity of web applications. The github website puts a lot of information on a single screen and uses flows within that screen to organize the experience. There are often flows within flows and we need to start documenting all the states to manage the complexity.</p>
<p>A later talk, given by <a href="https://github.com/nathanhammond">Nathan Hammond</a> introduced <a href="https://github.com/nathanhammond/ember-flows">ember-flow</a> which used a digraph to manage state transitions within an application. Nathan argued that it is our responsibility as application developers not to pollute the browser history with unnecessary transitions of state. To prevent this, he recommended the use of <a href="http://ember-doc.com/classes/Ember.Route.html#method_replaceWith">replaceWith</a> instead of <a href="http://ember-doc.com/classes/Ember.Route.html#method_transitionTo">transitionTo</a>.</p>
<h2 id="ember-components"><a class="zola-anchor" href="#ember-components" aria-label="Anchor link for: ember-components">Ember Components</a></h2>
<p><a href="https://github.com/rpflorence">Ryan Florence</a> gave an inspiring talk about Ember Components. He demonstrated a number of <a href="http://instructure.github.io/ic-ember/">components</a> that he has built for <a href="https://hermanradtke.com/2014/03/27/emberconf-2014.html/instructure.github.io">instructure</a>. Ryan provided some sage wisdom for component design:</p>
<ol>
<li>Good components usually do not have a template. If your component does have a template, consider breaking it down into multiple components. Ryan used <code><select><option /></select></code> as an example of good component design. The <code>select</code> and <code>option</code> tags are separate components that are related to each other. The <code>select</code> tag may be aware of which <code>option</code> tag is currently selected.</li>
<li>Create a component that groups small components together. Ryan's <a href="https://github.com/instructure/ic-tabs#usage">ic-tabs</a> component groups many small components together to make it easier for others to use.</li>
<li>The child should inform the parent when it is present. Do <em>not</em> make the parent poll for child components. Ryan used the <code><form></code> tag as an example of how this should work. A form can exist on its own. When adding a <code><button type="submit"></code> tag to the form, the button informs the form.</li>
</ol>
<h2 id="distributed-computing"><a class="zola-anchor" href="#distributed-computing" aria-label="Anchor link for: distributed-computing">Distributed Computing</a></h2>
<p><a href="https://github.com/cmeiklejohn/">Christopher Meiklejohn</a> from <a href="http://basho.com/">Basho</a> gave a very dense, but inspiring talk on distributed computing. The main take-away was that <code>ember-data</code> is trying to solve a problem of distributed computing.</p>
<ul>
<li>thinkdistributed.io - Christopher hosts a podcast that is a great introduction to the concepts of distributed computing. </li>
<li>https://syncfree.lip6.fr - Christopher is part of an EU project that is trying to solve "Large-scale computation without synchronisation".</li>
</ul>
<h2 id="ember-performance"><a class="zola-anchor" href="#ember-performance" aria-label="Anchor link for: ember-performance">Ember Performance</a></h2>
<p><a href="https://github.com/mixonic">Matthew Beale</a> discussed Ember performance. His book <a href="http://bleedingedgepress.com/our-books/developing-an-ember-edge/">Developing an Ember Edge</a> takes a deeper look into creating performant Ember applications. The main take-away was that a lot of the performance issues may not be related to JavaScript. Look at how the network and the browser painting (animations, rendering) are affecting performance. Read <a href="http://www.amazon.com/High-Performance-Browser-Networking-performance-ebook/dp/B00FM0OC4S">High Performance Browser Networking</a> for more insight into how to diagnose and resolve network issues.</p>
<p>Matthew did point out that <a href="http://emberjs.com/guides/object-model/observers/">observers</a> are actually a synchronous operation. Expensive functions that are observing a property should be handled with care. He suggested the use of <code>setProperties</code>, <code>Ember.run.once</code> and <code>pushObjects</code> to work around performance issues. Yehuda commented that the core team considers the synchronous nature of observers a bug, but it is hard to fix without breaking existing Ember 1.0 compatibility.</p>
<h2 id="other-links-related-to-talks"><a class="zola-anchor" href="#other-links-related-to-talks" aria-label="Anchor link for: other-links-related-to-talks">Other Links Related To Talks</a></h2>
<ul>
<li><a href="http://www.solitr.com/blog/2014/02/broccoli-first-release/">Broccoli: First Beta Release</a> - Broccoli is an awesome build tool created by <a href="https://github.com/joliss">Jo Liss</a>.</li>
<li><a href="http://extensiblewebmanifesto.org/">The Extensible Web Manifesto</a> was part of the closing keynote given by <a href="https://github.com/dherman">David Herman</a></li>
</ul>
<h2 id="sketches-of-talks"><a class="zola-anchor" href="#sketches-of-talks" aria-label="Anchor link for: sketches-of-talks">Sketches of Talks</a></h2>
<p><a href="https://twitter.com/chantastic">Michael Chan</a> sketched notes from each talk. Links to his Sketch for each talk are below:</p>
<ul>
<li><a href="https://twitter.com/chantastic/status/448517744900976641">Opening Keynote</a></li>
<li><a href="https://twitter.com/chantastic/status/448551975949721600">Using Ember To Make The Impossible Possible</a></li>
<li><a href="https://twitter.com/chantastic/status/448552274043080704">Contributing To Ember</a></li>
<li><a href="https://twitter.com/chantastic/status/448579736629809152">Ember Data And The Way Forward</a></li>
<li><a href="https://twitter.com/chantastic/status/448591731869511680">Broccoli</a></li>
<li><a href="https://twitter.com/chantastic/status/448592076557406208">Animations And Transitions</a></li>
<li><a href="https://twitter.com/chantastic/status/448640475071668224">Angular Directives</a></li>
<li><a href="https://twitter.com/chantastic/status/448640855876710400">Modeling The App Store</a></li>
<li><a href="https://twitter.com/chantastic/status/448641160433512448">HTMLBars - The Next Generation Of Templating</a></li>
<li><a href="https://twitter.com/chantastic/status/448877762812444672">The {{x-foo}} In You</a></li>
<li><a href="https://twitter.com/chantastic/status/448890153340137472">ember-cli</a></li>
<li><a href="https://twitter.com/chantastic/status/448901778449244160">Ember Is For The Children</a></li>
<li><a href="https://twitter.com/chantastic/status/448943222685847552">Query Params In Ember</a></li>
<li><a href="https://twitter.com/chantastic/status/448951800725377024">Ember Testing</a></li>
<li><a href="https://twitter.com/chantastic/status/448954197459759105">Convergent/Divergent</a></li>
<li><a href="https://twitter.com/chantastic/status/448968123509530624">Controlling Route Traversal With Flows</a></li>
<li><a href="https://twitter.com/chantastic/status/448979625612304384">Snappy Means Happy: Performance In Ember Apps</a></li>
<li><a href="https://twitter.com/chantastic/status/449004904363732992">Closing Keynote</a></li>
</ul>
http://activitystrea.ms/schema/1.0/postTesting Repeated Elements With Behat+Mink2014-03-21T00:00:00+00:002014-03-21T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2014/03/21/testing-repeated-elements-with-behat-mink.html/<p>The Mink extension to behat makes it really easy to test the contents of a page. I can use the <code>assertElementContainsText</code> feature to assert that some text exists within a certain element:</p>
<p>{% highlight gherkin %}
Then I should see "My Page Title" in the "h1" element.
{% endhighlight %}</p>
<p>If there is more than one <code>h1</code> element on the page, I can use a css selector to increase specificity:</p>
<p>{% highlight gherkin %}
Then I should see "My Page Title" in the "h1.page-title" element.
{% endhighlight %}</p>
<p>This is really powerful and css selectors make it pretty easy to identify most elements on a page. However, given the block of html below, how do I test the text in a repeated element?</p>
<p>{% highlight html %}</p>
<div class="items">
<div class="item-row">
<p class="item-row-name">Item 1</p>
</div>
<div class="item-row">
<p class="item-row-name">Item 2</p>
</div>
<div class="item-row">
<p class="item-row-name">Item 3</p>
</div>
</div>
{% endhighlight %}
<p>My initial attempt was to use a more advanced css selector:</p>
<p>{% highlight gherkin %}
Then I should see "Item 1" in the "div.item-row:nth(0)" element.
{% endhighlight %}</p>
<p>Unfortunately, the Symfony 2 web driver does not support this syntax. After talking with a few colleagues, I decided to create a feature that allowed me to do this. Here is an example of my feature syntax:</p>
<p>{% highlight gherkin %}
Then I should see the following in the repeated "div.item-row-name" element within the context of the "div.items" element:
| text |
| Item 1 |
| Item 2 |
| Item 3 |
{% endhighlight %}</p>
<p>Here is what the code looks like:</p>
<p>{% highlight php %}</p>
<?php
/**
* @Then /^(?:|I )should see the following in the repeated "(?P<element>[^"]*)" element within the context of the "(?P<parentElement>[^"]*)" element:$/
*/
public function assertRepeatedElementContainsText(TableNode $table, $element, $parentElement)
{
$parent = $this->getSession()->getPage()->findAll('css', $parentElement);
foreach ($table->getHash() as $n => $repeatedElement) {
$child = $parent[$n];
\PHPUnit_Framework_Assert::assertEquals(
$repeatedElement['text'],
$child->find('css', $element)->getText()
);
}
}
{% endhighlight %}
<p>I take advantage of the fact that the repeated elements have a common parent <code>div.items</code>. I find all children of the <code>div.items</code> element using the Mink <code>find</code> API. I can loop over the children and take advantage of the fact that the children are of type <code>NodeElement</code>. The <code>NodeElement</code> class shares the same base class as the <code>DocumentElement</code> object returned from the <code>$this->getSession()->getPage()</code> call. When I use the <code>find</code> method on the <code>$child</code> object, I will only search for elements that are within the context of the current child. Here is what that context looks like on the first iteration:</p>
{% highlight html %}
<div class="item-row">
<p class="item-row-name">Item 1</p>
</div>
{% endhighlight %}
<p>Now when I search for the element matching <code>div.item-row-name</code>, I only get back the <em>one</em> element within this context. I can then assert that the text within this element matches the corresponding item in my table.</p>
<p>Notice that I use a PHPUnit assertion within this feature. I would have preferred to re-use an existing Mink web assertion, but all of the assertions assume a global context. Look at the <code>elementTextContains</code> code to see what I mean.</p>
<p>Happy Hacking!</p>
http://activitystrea.ms/schema/1.0/postHypermedia Services and MVC2014-01-04T00:00:00+00:002014-01-04T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2014/01/04/hypermedia-services-and-mvc.html/<p>This blog post is in response to the discussion started by the <a href="https://twitter.com/nateabele/statuses/418965626410270720">tweet</a> below:</p>
<blockquote class="twitter-tweet" lang="en" align="center"><p>Incidentally, MVC is a REALLY poor fit for designing hypermedia services. So that's fun, everyone who thought you knew what you were doing.</p>— Nate Abele (@nateabele) <a href="https://twitter.com/nateabele/statuses/418965626410270720">January 3, 2014</a></blockquote>
<script async="async" src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>This post is not intended to present the best way to design and build hypermedia services. My goal is to present a high-level description of how we have built and designed a hypermedia API at HauteLook within the context of the MVC Architectural Pattern. That being said, I am always looking for ways to improve.</p>
<p>Let's start with the controller. According to the <a href="http://www.amazon.com/Design-Patterns-Elements-Reusable-Object-Oriented/dp/0201633612">Gang of Four</a>, the controller "defines the way the user interface reacts to user input". In the context of a hypermedia service, <em>user input</em> comes in the form of a request. The request is most commonly sent using HTTP, but could use another protocol such as FTP. We will be assuming HTTP for the rest of this post. The <em>user interface</em> is the API response the server sends back to the client. In a hypermedia API, the response is based on which resources the client (user) is requesting and what media type the client prefers.</p>
<p>Let us use the example below. We can assume that the client is knowledgeable of the <code>/users/42</code> URL because it made an HTTP GET request to <code>/</code> and received a list of user relations it can then request.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>GET /users/42 HTTP/1.1
</span><span>Host: example.com
</span><span>Accept: application/hal+json
</span></code></pre>
<p>The controller will first fetch the user resource. If successful, the controller will specify a 200 response code and any necessary headers, and then pass that resource to the view along with the client's desired media type. However, there are many reasons why the request may not be successful. The resource <code>/users/42</code> may not exist, the client may not be authorized to make the request, pre-conditions of the request may not be satisfied, or any number of other problems may occur. In any of those error cases, the controller will issue the proper response code, headers and any body necessary to represent the current state to the client.</p>
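<p>Here is a minimal sketch of that controller flow in ruby. It is not the HauteLook code: the <code>USERS</code> hash, the <code>show_user</code> method and its <code>authorized</code> flag are all hypothetical names I made up to illustrate the idea, and the status codes are just one reasonable mapping.</p>

```ruby
# Hypothetical in-memory store standing in for the persistence layer.
USERS = { 42 => { email: "jane@example.com" } }.freeze

# Hypothetical list of media types the view layer knows how to produce.
SUPPORTED_MEDIA_TYPES = ["application/hal+json"].freeze

# Hypothetical controller action. Returns [status, headers, resource].
# On success the resource is what would be handed to the view, together
# with the media type the client asked for.
def show_user(user_id, accept:, authorized: true)
  return [403, {}, nil] unless authorized
  return [406, {}, nil] unless SUPPORTED_MEDIA_TYPES.include?(accept)

  user = USERS[user_id]
  return [404, {}, nil] if user.nil?

  [200, { "Content-Type" => accept }, user]
end

status, headers, _resource = show_user(42, accept: "application/hal+json")
puts status
puts headers["Content-Type"]
```

<p>The controller only decides <em>what</em> happened and which resource to represent. It never builds the response body itself.</p>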
<p>The resources in our hypermedia API are models. We have some code samples of models for address and user resources below. I chose ruby as it makes the code very concise. How exactly these models are populated is an exercise left up to the reader. What we do <strong>not</strong> want to do is simply transfer the data in our persistence layer (i.e. database) directly to the client. If we are using something like a relational database, there may be many tables required to accurately represent one resource. Take a look at the <code>Address</code> model below. We may need to execute a SQL statement that joins some hypothetical <code>addresses</code>, <code>states</code> and <code>countries</code> tables together in order to create address resources. Our goal is to encapsulate the resources you want to represent to the client. If the underlying persistence layer changes, the resource should not change.</p>
<pre data-lang="ruby" style="background-color:#2b303b;color:#c0c5ce;" class="language-ruby "><code class="language-ruby" data-lang="ruby"><span style="color:#b48ead;">class </span><span style="color:#ebcb8b;">Address
</span><span style="color:#eff1f5;"> </span><span style="color:#8fa1b3;">attr_accessor </span><span style="color:#a3be8c;">:line1</span><span>, </span><span style="color:#a3be8c;">:line2</span><span>, </span><span style="color:#a3be8c;">:state</span><span>, </span><span style="color:#a3be8c;">:country</span><span>,
</span><span> </span><span style="color:#a3be8c;">:postal_code
</span><span style="color:#b48ead;">end
</span><span>
</span><span style="color:#b48ead;">class </span><span style="color:#ebcb8b;">User
</span><span style="color:#eff1f5;"> </span><span style="color:#8fa1b3;">attr_accessor </span><span style="color:#a3be8c;">:user_id</span><span>, </span><span style="color:#a3be8c;">:email</span><span>, </span><span style="color:#a3be8c;">:first_name</span><span>, </span><span style="color:#a3be8c;">:last_name</span><span>,
</span><span> </span><span style="color:#a3be8c;">:addresses
</span><span>
</span><span> </span><span style="color:#b48ead;">def </span><span style="color:#8fa1b3;">initialize</span><span>(</span><span style="color:#bf616a;">addresses</span><span>)
</span><span> @</span><span style="color:#bf616a;">addresses </span><span>= addresses
</span><span> </span><span style="color:#b48ead;">end
</span><span>
</span><span> </span><span style="color:#b48ead;">def </span><span style="color:#8fa1b3;">fullName</span><span>()
</span><span> "#{first_name} #{last_name}"
</span><span> </span><span style="color:#b48ead;">end
</span><span style="color:#b48ead;">end
</span></code></pre>
<p>Once we have our resources created, we need to represent them to the client using the view. The view is only concerned with how we are presenting our resource to the client. It is not scalable to explicitly write out every resource/media-type combination. We want to <em>describe</em> how to represent a resource in an agnostic way. We then feed those descriptions into a serializer that is aware of HAL, Atom, JSON-API, etc and can generate the output based on the desired media type of the client.</p>
<p>Here is an example DSL that describes how to represent a user resource to the client:</p>
<pre data-lang="yaml" style="background-color:#2b303b;color:#c0c5ce;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#bf616a;">relations</span><span>:
</span><span> </span><span style="color:#bf616a;">self</span><span>:
</span><span> </span><span style="color:#bf616a;">href</span><span>:
</span><span> </span><span style="color:#bf616a;">route</span><span>: </span><span style="color:#a3be8c;">/users/:user_id
</span><span> </span><span style="color:#bf616a;">params</span><span>:
</span><span> </span><span style="color:#bf616a;">user_id</span><span>: </span><span style="color:#a3be8c;">id
</span><span>
</span><span> </span><span style="color:#bf616a;">/rels/orders</span><span>:
</span><span> </span><span style="color:#bf616a;">href</span><span>:
</span><span> </span><span style="color:#bf616a;">route</span><span>: </span><span style="color:#a3be8c;">/orders?user_id=:user_id
</span><span> </span><span style="color:#bf616a;">params</span><span>:
</span><span> </span><span style="color:#bf616a;">user_id</span><span>: </span><span style="color:#a3be8c;">id
</span><span>
</span><span> </span><span style="color:#bf616a;">/rels/addresses</span><span>:
</span><span> </span><span style="color:#bf616a;">href</span><span>:
</span><span> </span><span style="color:#bf616a;">route</span><span>: </span><span style="color:#a3be8c;">/users/:user_id/addresses
</span><span> </span><span style="color:#bf616a;">params</span><span>:
</span><span> </span><span style="color:#bf616a;">user_id</span><span>: </span><span style="color:#a3be8c;">id
</span><span> </span><span style="color:#bf616a;">embed</span><span>:
</span><span> </span><span style="color:#bf616a;">property</span><span>: </span><span style="color:#a3be8c;">addresses
</span><span>
</span><span style="color:#bf616a;">properties</span><span>:
</span><span> </span><span style="color:#bf616a;">email</span><span>: </span><span style="color:#a3be8c;">email
</span><span> </span><span style="color:#bf616a;">name</span><span>: </span><span style="color:#a3be8c;">fullName
</span></code></pre>
<p>The user class is serialized into the correct media type based on the description. The <code>relations</code> section describes how the user relates to other resources in the API. The <code>properties</code> section describes how to show the resource to the client. In this case, we do not show first and last name as separate properties but instead show one <code>name</code> property. More importantly, we do not include the user_id in the response. It is probably not important to the client what the user id is, except to create URLs. We have the relations to avoid client-side URL generation though. Also, notice how the addresses are embedded in the representation of a user. The serializer would look up how to represent the address and serialize it accordingly.</p>
<p>The above DSL is loosely based on the <a href="http://hateoas-php.org/">Symfony 2 HATEOAS bundle</a> that we use at HauteLook. Reading through the documentation you may notice that it uses funky PHP annotations. This is just a preference of the maintainers. There is also built-in support to use separate PHP or XML files to describe the view.</p>
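<p>To make the serialization step a little more concrete, here is a rough ruby sketch of a serializer that turns a description into a HAL-style hash. This is not how the HATEOAS bundle works internally. The <code>DESCRIPTION</code> hash, <code>expand_route</code> and <code>to_hal</code> are hypothetical names, the <code>User</code> struct is a cut-down stand-in that exposes an <code>id</code> property, and embedding of addresses is left out to keep it short.</p>

```ruby
require "json"

# Hypothetical stand-in for the User model with only what the sketch needs.
User = Struct.new(:id, :email, :first_name, :last_name) do
  def fullName
    "#{first_name} #{last_name}"
  end
end

# The description from the DSL, written as a plain hash (assumed shape).
DESCRIPTION = {
  relations: {
    "self" => { route: "/users/:user_id", params: { user_id: :id } },
    "/rels/orders" => { route: "/orders?user_id=:user_id", params: { user_id: :id } }
  },
  properties: { email: :email, name: :fullName }
}.freeze

# Replace each :placeholder in the route with a value asked from the resource.
def expand_route(route, params, resource)
  params.reduce(route) do |url, (placeholder, method_name)|
    url.gsub(":#{placeholder}", resource.public_send(method_name).to_s)
  end
end

# Build a HAL-style hash: described properties plus a _links section.
def to_hal(resource, description)
  links = description[:relations].transform_values do |relation|
    { "href" => expand_route(relation[:route], relation[:params], resource) }
  end
  body = description[:properties].each_with_object({}) do |(name, method_name), out|
    out[name.to_s] = resource.public_send(method_name)
  end
  body.merge("_links" => links)
end

user = User.new(42, "jane@example.com", "Jane", "Doe")
puts JSON.pretty_generate(to_hal(user, DESCRIPTION))
```

<p>A serializer for another media type could read the same description and emit a different structure, which is the point of keeping the description agnostic.</p>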
<p>This architecture has made it fairly straightforward to design and maintain a hypermedia API. The risk of changes at the persistence layer leaking into the client representations is low. The models encapsulate the resources and we can make them as resistant to change as we want. The view presents the representation of the resource and how the resource relates to other resources to the client.</p>
http://activitystrea.ms/schema/1.0/postPartnerships: Dave Ramsey vs technology startups2013-09-01T00:00:00+00:002013-09-01T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2013/09/01/partnerships-dave-ramsey-vs-startups.html/<p>Dave Ramsey has a famous quote: "The only ship that won't sail is a partnership" <a href="http://www.daveramsey.com/index.cfm?event=askdave/&intContentItemId=123051">(source)</a>. I believe the context is in regards to two friends starting a business together. I recently heard this quote and wondered how many successful startups were partnerships.</p>
<p>I did a quick <a href="http://www.crunchbase.com/search/advanced/companies/2039685">search</a> on CrunchBase for startups that were acquired for over 1 million dollars since January 2010. The results show 12 out of 20 startups were partnerships. I cannot tell how many were friends prior to the forming of the company. This does not show me any startups that went IPO either.</p>
<p>List of startups that were partnerships:</p>
<ul>
<li>Instagram</li>
<li>Rapportive</li>
<li>adGrok</li>
<li>Mobile Theory</li>
<li>Screaming Daily Deals</li>
<li>Socialcam</li>
<li>condaptive</li>
<li>LovingEco</li>
<li>seatme</li>
<li>Pulse</li>
<li>carbyn</li>
<li>Crashlytics</li>
<li>BrightNest</li>
</ul>
<p>I would like to look back 10 years, but there is no way to filter the CrunchBase search for partnerships. I may spend some more time compiling a database. I am also curious as to the division of labor amongst the co-founders. I think partnerships have the potential for more problems when two or more of the co-founders have the same skillset.</p>
http://activitystrea.ms/schema/1.0/postDRY and Clean Interfaces2013-03-16T00:00:00+00:002013-03-16T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2013/03/16/dry-and-clean-interfaces.html/<p>The principle of Don't Repeat Yourself (DRY) is more than just grouping common code together. When trying to apply the DRY principle, it is easy to start making a mess of a class interface. I recently had to write some code to generate Flickr image URLs from an API response. I needed to generate two types of URLs: a thumbnail and a normal image. Here is one version of code reuse:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code>- (NSString *)generateFlickrImageUrl:(NSDictionary *)photo withImageType:(NSString *)imageType
{
    NSString *imageSize;

    if ([imageType isEqualToString:@"thumbnail"]) {
        imageSize = @"t";
    } else {
        imageSize = @"z";
    }

    NSString *farm = [photo valueForKey:@"farm"];
    NSString *server = [photo valueForKey:@"server"];
    NSString *photoId = [photo valueForKey:@"id"];
    NSString *secret = [photo valueForKey:@"secret"];
    NSString *url = [NSString stringWithFormat:@"http://farm%@.staticflickr.com/%@/%@_%@_%@.jpg", farm, server, photoId, secret, imageSize];

    return url;
}
</code></pre>
<p>While this does reuse code, it is bad because the code is not accepting of change. If I have to add some other sort of image type, then I have to modify this function. This is a big red flag. When you start using method parameters as an extension of your interface, you may be making the code hard to change. Also, anyone using this class will have to look for the list of available options for the <code>imageType</code> parameter. I think it is better to design an easy-to-understand interface instead.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code>- (NSString *)generateFlickrImageUrl:(NSDictionary *)photo withImageSize:(NSString *)imageSize
{
    NSString *farm = [photo valueForKey:@"farm"];
    NSString *server = [photo valueForKey:@"server"];
    NSString *photoId = [photo valueForKey:@"id"];
    NSString *secret = [photo valueForKey:@"secret"];
    NSString *url = [NSString stringWithFormat:@"http://farm%@.staticflickr.com/%@/%@_%@_%@.jpg", farm, server, photoId, secret, imageSize];

    return url;
}

- (NSString *)generateImageUrlThumbnail:(NSDictionary *)photo
{
    return [self generateFlickrImageUrl:photo withImageSize:@"t"];
}

- (NSString *)generateImageUrl:(NSDictionary *)photo
{
    return [self generateFlickrImageUrl:photo withImageSize:@"z"];
}
</code></pre>
<p>The <code>generateFlickrImageUrl</code> is now a protected method of the class while the <code>generateImageUrl</code> and <code>generateImageUrlThumbnail</code> methods define the public interface.</p>
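<p>For comparison, here is the same shape sketched in ruby. The <code>FlickrUrls</code> class and its method names are hypothetical and only illustrate the idea: one private method builds the URL, and each image type gets its own small public method.</p>

```ruby
# Hypothetical class; the names are illustrative, not from a real library.
class FlickrUrls
  # Public interface: one intention-revealing method per image type.
  def image_url(photo)
    build_url(photo, "z")
  end

  def thumbnail_url(photo)
    build_url(photo, "t")
  end

  private

  # Single place that knows the Flickr URL layout and the size suffixes.
  def build_url(photo, size)
    format("http://farm%s.staticflickr.com/%s/%s_%s_%s.jpg",
           photo["farm"], photo["server"], photo["id"], photo["secret"], size)
  end
end

photo = { "farm" => "1", "server" => "2", "id" => "3", "secret" => "abc" }
puts FlickrUrls.new.thumbnail_url(photo)
puts FlickrUrls.new.image_url(photo)
```

<p>Adding another image type means adding another one-line public method. Existing callers and the URL-building code do not change.</p>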
http://activitystrea.ms/schema/1.0/postMisunderstanding DRY2013-02-06T00:00:00+00:002013-02-06T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2013/02/06/misunderstanding-dry.html/<p>I think the Object Oriented (OO) principle of Don't Repeat Yourself (DRY) is often misunderstood. In particular, the word "repeat" is troublesome. It has nothing to do with minimizing the amount of code you write. It is also not about merging similar methods together into a super-method. The DRY principle is about preserving a single source of truth in a system. When there are multiple sources of truth in a system we have to write more code to manually keep all the truths in sync with each other. This often leads to unintended consequences in one part of the system when a change is made to a different part of the system. We are left to search through the code for these unintended consequences and become increasingly reluctant to change. Properly applying the DRY principle protects us from these unintended consequences and can make our code much more accepting of change.</p>
<p>To those less experienced, DRYing up some parts of the code may seem like a waste of time in the present. I want to use some code samples in an attempt to prove that fixing even simple DRY violations can be very helpful. The following class is a simple <code>Car</code> class. It starts out with a single method <code>currentSpeed()</code> which returns the current speed of the <code>Car</code> instance. In a real class, there would be more implementation detail. For now we are just concerned with the speed of the <code>Car</code> class. We will change the <code>Car</code> and use the DRY principle to help us design good code.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>class Car
</span><span>{
</span><span> protected $speed;
</span><span>
</span><span> public function currentSpeed() { return $this->speed; }
</span><span>}
</span></code></pre>
<p>At this point the class seems pretty reasonable. The <code>$speed</code> member variable stores the speed of the <code>Car</code> instance. The <code>currentSpeed()</code> method simply returns the speed. Now let us pretend that we need to add the logic for cruise control. A basic cruise control system is made up of four operations: toggle, set, cancel and resume. The toggle operation turns the cruise control on and off. The set operation will determine the speed of the <code>Car</code> object and maintain that speed. The cancel operation instructs the cruise control system to stop maintaining the set speed. The resume operation signals the cruise control system to accelerate to the set speed and then maintain that speed. The toggle operation is uninteresting, so let's start implementing the set operation. We might do something like this:</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>class Car
</span><span>{
</span><span> protected $speed;
</span><span> protected $cruisingSpeed;
</span><span>
</span><span> public function currentSpeed() { return $this->speed; }
</span><span>
</span><span> public function cruiseControlSet() {
</span><span> $this->cruisingSpeed = $this->speed;
</span><span> }
</span><span>}
</span></code></pre>
<p>This is a simple change, but we actually just violated the DRY principle. The <code>cruiseControlSet()</code> method should not have direct access to the <code>$speed</code> member variable. Good OO design focuses on passing around messages (or methods) and not data. The <code>$speed</code> member variable is data. We should use the <code>currentSpeed()</code> method to <em>ask</em> for the speed. I made a special point to use the word <em>ask</em> in the previous sentence. Our current implementation is not asking for anything. It knows <em>how</em> the <code>$speed</code> data is stored within the <code>Car</code> class. What is the big deal though, right? It is obvious by looking at this code that two different methods are accessing <code>$speed</code>. If we are going to change how <code>$speed</code> works later on, we can deal with it then. YAGNI bro!</p>
<p>Trying to predict the future is a sure way to make your code design overly complex. The principle of You Aren't Gonna Need It (YAGNI) addresses this concern. However, we have already established a pattern here. More than one method needs to know the speed of the <code>Car</code> object. There is a good chance that more methods will need to know the speed as well. It is also important to consider the cost of making a change. In this case, the cost of changing <code>cruiseControlSet()</code> to use the <code>currentSpeed()</code> method instead of directly accessing <code>$speed</code> is very low. When the cost is low, err on the side of good OO design in an attempt to make change easy. Do it even if you are sure that <code>$speed</code> will never change.</p>
<p>Something else starts to become apparent as we add more of the cruise control functionality to the <code>Car</code> class. Let's add the other methods and see if we can spot it.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>class Car
</span><span>{
</span><span> protected $speed;
</span><span> protected $cruisingSpeed;
</span><span>
</span><span> public function currentSpeed() { return $this->speed; }
</span><span> public function cruisingSpeed() { return $this->cruisingSpeed; }
</span><span>
</span><span> public function cruiseControlToggle() { ... }
</span><span>
</span><span> public function cruiseControlSet() {
</span><span> $this->cruisingSpeed = $this->currentSpeed();
</span><span> }
</span><span>
</span><span> public function cruiseControlCancel() { ... }
</span><span>
</span><span> public function cruiseControlResume() {
</span><span> if ($this->currentSpeed() != $this->cruisingSpeed()) {
</span><span> ...
</span><span> }
</span><span> }
</span><span>}
</span></code></pre>
<p>As we are adding the cruise control functionality to the <code>Car</code> class something starts to feel wrong. The class is getting large in a hurry. Also, our tests may be getting harder to set up. This functionality is screaming to be refactored into a separate class. Another hint is that we started using a common method prefix of <code>cruiseControl</code>. Whenever this happens, we should really consider if this functionality is part of this class. Let's move all the <code>cruiseControl*()</code> methods and the <code>cruisingSpeed()</code> method into another class. Watch closely how the DRY principle helps us minimize the amount of changes we make in this refactor.</p>
<pre style="background-color:#2b303b;color:#c0c5ce;"><code><span>class Car
</span><span>{
</span><span> protected $speed;
</span><span>
</span><span> public function currentSpeed() { return $this->speed; }
</span><span>}
</span><span>
</span><span>class CruiseControl
</span><span>{
</span><span> protected $car;
</span><span> protected $cruisingSpeed;
</span><span>
</span><span> public function cruisingSpeed() { return $this->cruisingSpeed; }
</span><span>
</span><span> public function toggle() { ... }
</span><span>
</span><span> public function set() {
</span><span> $this->cruisingSpeed = $this->currentSpeed();
</span><span> }
</span><span>
</span><span> public function cancel() { ... }
</span><span>
</span><span> public function resume() {
</span><span> if ($this->currentSpeed() != $this->cruisingSpeed()) {
</span><span> ...
</span><span> }
</span><span> }
</span><span>
</span><span> public function __construct(Car $car)
</span><span> {
</span><span> $this->car = $car;
</span><span> }
</span><span>
</span><span> protected function currentSpeed()
</span><span> {
</span><span> return $this->car->currentSpeed();
</span><span> }
</span><span>}
</span></code></pre>
<p>Notice how the methods that implement our cruise control operations are still using the <code>currentSpeed()</code> method. They did not have to change because we just added a protected method to get the current speed of the car. We hide away the knowledge of where the speed is coming from as these methods are not concerned with that specific knowledge. We are able to do this because the cruise control is given access to the public interface of the <code>Car</code> class and can then determine the speed. This forces the cruise control system to <em>ask</em> the <code>Car</code> class to do things. For example, the cruise control system no longer has the potential to change the <code>$speed</code> of the <code>Car</code> class. If it wants to change speeds, it must <em>ask</em> a <code>Car</code> object to accelerate or decelerate.</p>
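<p>Here is a small ruby sketch of the payoff. It is a hypothetical variation on the classes above, not a translation of them: I assume the <code>Car</code> stops storing its speed and derives it from a made-up <code>wheel_rpm</code> value instead. Because the cruise control only ever <em>asks</em> for the speed, it does not change at all.</p>

```ruby
class Car
  def initialize(wheel_rpm)
    # Assumption for illustration: speed is no longer stored directly.
    @wheel_rpm = wheel_rpm
  end

  # Still the single source of truth for the speed of the car.
  def current_speed
    @wheel_rpm / 10
  end
end

class CruiseControl
  attr_reader :cruising_speed

  def initialize(car)
    @car = car
  end

  # Remember the speed the car is travelling at right now.
  def set
    @cruising_speed = current_speed
  end

  # True when the car has drifted away from the set speed.
  def resume_needed?
    current_speed != cruising_speed
  end

  private

  # The only place that knows the speed comes from the car.
  def current_speed
    @car.current_speed
  end
end

car = Car.new(650)
cruise = CruiseControl.new(car)
cruise.set
puts cruise.cruising_speed
puts cruise.resume_needed?
```

<p>Had the cruise control reached into the car's data directly, the change to <code>wheel_rpm</code> would have broken it.</p>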
<p>I hope I have shown the benefits of DRYing up code, even when the DRY violations appear to be harmless. You may still have some reservations about the benefits of the DRY principle. I encourage you to put those reservations aside for a period of time and adhere to the principle. Chances are you will notice an increase in the quality of design.</p>
http://activitystrea.ms/schema/1.0/postWhat do MySQL datetime types and scrum have in common?2013-01-19T00:00:00+00:002013-01-19T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2013/01/19/datetime-scrum.html/<p>I have always wondered why the MySQL datetime type only has second precision. Developers will happily put a datetime type in a unique index though. They justify that decision by telling themselves, "Two rows inserted within the same second will never happen". That same developer just got done reading the latest MongoDB article that benchmarked a gazillion inserts a second. They tweeted it too. I suggest that we use a bigint and store a timestamp with greater precision if we are to use it as a unique index (or use Postgres). "A bigint takes up too much space!". Cool guid there, bro. So what does this tangent have to do with scrum? This whole notion of 2-4 week sprints for development is akin to using second precision when a <a href="http://www.fusionio.com/data-sheets/iodrive2-duo/">Fusion ioDrive2</a> can rock 500 million write IOPS.</p>
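<p>To make the unique index problem concrete, here is a tiny ruby sketch. The two timestamps are made-up values 250 milliseconds apart, standing in for two rows inserted within the same second. At second precision they produce the same key. As microseconds stored in an integer, they do not.</p>

```ruby
# Two hypothetical inserts within the same second (7th argument is usec).
first  = Time.utc(2013, 1, 19, 12, 0, 0, 100_000)
second = Time.utc(2013, 1, 19, 12, 0, 0, 350_000)

# Second precision: what a datetime column with no fractional part keeps.
second_precision = [first, second].map(&:to_i)

# Microseconds since the epoch: fits comfortably in a 64-bit integer.
micro_precision = [first, second].map { |t| t.to_i * 1_000_000 + t.usec }

puts second_precision.uniq.size # 1 distinct key, so the second insert collides
puts micro_precision.uniq.size  # 2 distinct keys
```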
<p>I just have this notion in my head that 2-4 week sprints are not conducive to good business. I may be wrong, but <a href="http://blog.expensify.com/2013/01/11/ceo-friday-startup-best-practices-95-failure-rate/">95% of startups fail</a> anyways so you might as well try something different. I prefer 1-2 day iterations. It may sound crazy to break a 4 month project down into 1-2 day iterations, but I think the end result will be significantly better. It distills a project or feature down to what is most important. With 2 week iterations, going through 4 iterations of a product takes 2 months. That feels like a lifetime to me. The quarter is nearly over by then. I hope your (educated) guess was right because you have almost no time to try something else. I am not even confining this 1-2 day iteration idea to startups either. I think it is more important for larger businesses because they have more to gain (and lose).</p>
<p>I expect this 1-2 day pace to feel chaotic to most people. Most people want to establish some kind of process so they feel like they have some control. A 2-4 week sprint is great because we now have time to estimate. With estimates we now have visibility into the health of the project through burndown charts and can measure things like team velocity. When did "team velocity" suddenly become more important than growing the business? They would rather sit up in the ivory tower and watch Uncle Bob discuss the best practice for the number of lines of a class method instead.</p>
<p>Grow the business. And stop using MySQL datetime fields in unique indices.</p>
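<p>The datetime collision from the opening paragraph can be sketched in a few lines. This is only an illustration under stated assumptions: SQLite (via Python's standard library) stands in for MySQL, the <code>events</code> table and the two event times are made up, and the helper names are hypothetical. The principle is the same in any database that enforces a unique index.</p>

```python
import sqlite3
from datetime import datetime, timezone


def insert_all(column_type, values):
    """Insert each value into a UNIQUE column and return how many inserts succeed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE events (created_at {column_type} UNIQUE)")
    inserted = 0
    for value in values:
        try:
            conn.execute("INSERT INTO events (created_at) VALUES (?)", (value,))
            inserted += 1
        except sqlite3.IntegrityError:
            # The unique index rejected a duplicate value.
            pass
    conn.close()
    return inserted


def to_seconds(dt):
    """Second precision, like a classic datetime column."""
    return dt.strftime("%Y-%m-%d %H:%M:%S")


def to_micros(dt):
    """Microseconds since the epoch, small enough to fit in a bigint."""
    return int(dt.timestamp()) * 1_000_000 + dt.microsecond


# Two hypothetical events that happen 250 ms apart, within the same second.
first = datetime(2013, 1, 19, 12, 0, 0, 100000, tzinfo=timezone.utc)
second = datetime(2013, 1, 19, 12, 0, 0, 350000, tzinfo=timezone.utc)

second_precision = insert_all("TEXT", [to_seconds(first), to_seconds(second)])
micro_precision = insert_all("INTEGER", [to_micros(first), to_micros(second)])

print(second_precision)  # only one of the two rows survives
print(micro_precision)   # both rows are stored
```

<p>With second precision the two events collapse to the same value and the second insert is rejected. Storing microseconds in an integer column keeps them distinct, although it only narrows the window for a collision and does not make one impossible.</p>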
http://activitystrea.ms/schema/1.0/postDo Not Be Afraid Of New Technology2012-12-30T00:00:00+00:002012-12-30T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2012/12/30/do-not-be-afraid-of-new-technology.html/<p>I think the <a href="http://lucumr.pocoo.org/2012/12/29/sql-is-agile/">SQL is Agile</a> post by Armin Ronacher is a little short-sighted. <em>Disclaimer: I rarely use MongoDB or other NoSQL offerings</em>. There are plenty of good reasons to use MongoDB in production. Many people and companies have shown it to be successful. More importantly we should not be afraid of trying out new things, especially for side projects. We should be cautious of trying new things for important projects. By "important", I mean those projects that need to be reliable or where there is some sort of SLA. This important distinction between types of projects is not made in the article. If we are building some toy application or program, then I see no problem considering MongoDB. For those 1-2% of toy projects that take off and need to scale big, you will have some growing pains with MongoDB. However, you will always have growing pains. Even with SQL there will be growing pains. There are always new ideas coming out about how we can use relational databases to scale. At one point there was no such thing as the idea of sharding or master/slave replication. These ideas came about as SQL became more mature. There may be another breakthrough that no one has thought of to scale relational databases. There also may be some breakthroughs with NoSQL databases that make them scale better than they do now.</p>
<p><a href="http://en.wikipedia.org/wiki/Relational_database">Relational databases</a> have been around since the 1970s and Oracle was started in 1977. There has been a ton of research poured into relational databases. We have the SQL-99 standard and ACID compliance. There are a lot of people using relational databases and they have become very familiar with them over the years. MongoDB was just started a little over <a href="http://en.wikipedia.org/wiki/MongoDB#History">5 years ago</a>. Not as many people have as much knowledge or experience with MongoDB as they do with something like MySQL because it has not been around as long. To say that you will "never" use it again for a project moving forward is the wrong attitude. If you do not feel comfortable with MongoDB, then you should not use it for any sort of serious project. That should not prevent you from using it for toy projects so you can build more experience. Let's face it: document databases have a place in the world and they are not going to go away.</p>
<p>I think a lot of the problems Armin has with NoSQL, such as adding a new index, will go away. The fact that indices are hard to add or can change the output is a problem that can be solved. That is not a problem inherent to document databases. That is just a pain point of the implementation. The global lock in MongoDB was a huge pain point and it eventually went away. Maybe saying that you "would never start a project with MongoDB" is just sensationalism. I believe NoSQL can be used for toy projects if you think the technology is interesting but are not ready to use it on something more serious. That is a great way we can be responsible with approaching new technology.</p>
http://activitystrea.ms/schema/1.0/postPHP: The Good Parts2012-07-16T00:00:00+00:002012-07-16T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2012/07/16/php-the-good-parts.html/<p>This blog post is inspired by Douglas Crockford's book <a href="http://www.amazon.com/JavaScript-Good-Parts-Douglas-Crockford/dp/0596517742">JavaScript: The Good Parts</a>.</p>
<p>All programming languages have warts. That is, there were certain decisions made about a language that are less than ideal. Some people are driven to remove these warts from the language in an attempt to make the language better. I think this is done with the best intentions, but can often have negative consequences. JavaScript is a good example of a language that has a lot of warts. Despite all these warts, JavaScript is a very useful language and has seen a huge rise in popularity. I feel PHP is the same way. It has bad parts, but there are so many good parts that we need to celebrate those good parts. Here are a few things off the top of my head:</p>
<h2>Arrays</h2>
I think the array is the single most powerful and useful part of PHP. The PHP array is the Swiss Army knife in my programming toolkit. I have written applications in a number of other software languages and I have yet to find anything else more useful. The best part about arrays is that they just work. I don't have to decide ahead of time between a list or a map. The PHP array is to data structures as NoSQL is to SQL. Better still is that PHP core uses them all over the place. Results from the database: arrays. Parsing a JSON POST from the client: arrays. They are ubiquitous in PHP in both core and userland. I cannot say enough good things about PHP arrays.
<h2>Web Ready</h2>
PHP is web ready. I do not mean that PHP is easy to integrate into a web server. PHP is easy to integrate, but I think a lot of languages do a good job of integrating with web servers now. I mean PHP is built for the web. It is so easy to create an HTML template and pass the data to it. I think Mustache and Twig are great. That being said, I do not have to decide on a templating language in order to get up and running. Everyone understands HTML.
<p>I do think this feature is getting less important as the web develops. I write a lot of APIs and send almost everything to the client via JSON. However, there are still tons of websites out there that are not platforms and need to serve up HTML.</p>
<h2>Streams</h2>
Streams are the best kept secret in PHP. Most people do not even realize they are using streams when they are interacting with file systems or networks. I wrote a <a href="https://gist.github.com/1706840">plugin to push messages</a> to the Phergie IRC bot in less than an hour using streams. They are a really powerful abstraction that is used all over PHP.
<h2>Type Juggling</h2>
For the most part, a web application is just a bunch of strings. HTTP is all strings, most database adapters return strings and all output is strings. PHP handles all of this and removes all kinds of boilerplate code from my applications. I think PHP has the most sensible implementation of juggling too. Yes, there are some problems with large integers that are represented as strings. It is by no means perfect. However, I think the PHP zval has saved me orders of magnitude more hours than pain.
<p>For all the warts, there is plenty of beauty in PHP. I still enjoy writing web applications using PHP and focus my time using the parts of PHP that work really well.</p>
http://activitystrea.ms/schema/1.0/postManaging Gearman With Gearadmin Tool2012-04-23T00:00:00+00:002012-04-23T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2012/04/23/managing-gearman-with-gearadmin-tool.html/<p>The more jobs flowing through Gearman, the more likely something will happen. Queues can get backed up, workers can crash and performance can degrade. It is important to monitor the status of the Gearman ecosystem and be proactive about fixing problems. We can do this using the <em>gearadmin</em> tool.<span id="continue-reading"></span>The <em>gearadmin</em> program is a relatively new program that makes administration of Gearman easier. Before the release of gearman 0.19 the only way to query the gearman daemon was to use telnet. You still can use telnet and reference the Administrative Protocol section <a href="http://gearman.org/?id=protocol">http://gearman.org/?id=protocol</a> for a list of commands. While telnet is still an option, the <em>gearadmin</em> tool saves a lot of boilerplate scripts from being written. The <em>gearadmin</em> tool is really nothing more than a wrapper around the telnet commands. It does make capturing the output a little easier and you don't have to memorize the commands.</p>
<p>I wish the <em>gearadmin</em> output was a little nicer to read. The raw dump of the telnet output leaves the data a little cryptic.</p>
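<p>One way to make the raw dump easier to read is to post-process it. The sketch below is not part of Gearman. It assumes the layout the administrative protocol describes for the <em>status</em> command: one line per function with tab-separated fields (function name, total jobs, running jobs, available workers), ending with a line that contains a single period. The sample text and the function names in it are made up for illustration, as are the helper names.</p>

```python
from typing import List, NamedTuple


class FunctionStatus(NamedTuple):
    name: str
    total: int
    running: int
    workers: int


def parse_status(raw: str) -> List[FunctionStatus]:
    """Parse status output, assuming tab-separated fields and a '.' terminator line."""
    rows = []
    for line in raw.splitlines():
        line = line.strip()
        if not line or line == ".":
            continue  # skip blank lines and the end-of-response marker
        name, total, running, workers = line.split("\t")
        rows.append(FunctionStatus(name, int(total), int(running), int(workers)))
    return rows


def format_status(rows: List[FunctionStatus]) -> str:
    """Render the parsed rows as an aligned table with a header line."""
    width = max([len("FUNCTION")] + [len(row.name) for row in rows])
    lines = [f"{'FUNCTION':<{width}}  {'TOTAL':>6}  {'RUNNING':>7}  {'WORKERS':>7}"]
    for row in rows:
        lines.append(
            f"{row.name:<{width}}  {row.total:>6}  {row.running:>7}  {row.workers:>7}"
        )
    return "\n".join(lines)


# Hypothetical output captured from the status command.
SAMPLE = "resize_image\t12\t3\t4\nsend_email\t0\t0\t2\n.\n"

rows = parse_status(SAMPLE)
print(format_status(rows))
```

<p>Because the parser only works on text, the same function can be fed output captured from <em>gearadmin</em> or from a telnet session, as long as that output follows the layout assumed above.</p>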
http://activitystrea.ms/schema/1.0/postEvaluate multidimensional arrays using vim xdebug2012-03-27T00:00:00+00:002012-03-27T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2012/03/27/evaluate-multidimensional-arrays-using-vim-xdebug.html/<p>I love vim. I love XDebug. I need it to evaluate multidimensional (or nested) arrays though and it does not seem to do that. Chris Hartjes tweeted about his frustration with arrays too and I decided to fix the problem. Turns out there is nothing to fix.</p>
<p>We just need to tweak a default configuration setting to get this all to work. Open up your .vimrc and add the following line:</p>
<p><code> let g:debuggerMaxDepth = 3
</code></p>
<p>That depth means you will be able to view the contents of a triply-nested array. That seemed like a sensible default to me. Now you can evaluate or get the property of any variable like normal.</p>
<p>Once the depth is set there is no way to change it during a debug session. You have to close the existing XDebug session, update the value and start a new session. I plan on changing this in a future release of the plugin.</p>
http://activitystrea.ms/schema/1.0/postUsing the Gearman Tool For Rapid Development of Clients and Workers2012-01-25T00:00:00+00:002012-01-25T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2012/01/25/using-gearman-tool-for-rapid-development-of-clients-and-workers.html/<p>Gearman comes with a few tools that make development and testing easier. The <em>gearman</em> program creates boilerplate clients and workers. The <em>gearman</em> program comes by default with the gearmand package. Do not confuse <em>gearman</em> with <em>gearmand</em>. The <em>gearmand</em> daemon is what manages the queue, clients and workers. The <em>gearman</em> program is a tool to quickly create simple clients and workers. The options for <em>gearman</em> can be slightly confusing, so I will go through a set of examples on how to use them.<span id="continue-reading"></span></p>
<p>I find myself using the client functionality of the <em>gearman</em> tool most often. If I am tasked with creating or updating a gearman worker I want to test that the worker actually works. The code that sends a job to the worker is normally part of the web application and I don't want to dig through the application trying to figure out what I need to do to send the job out. I could just create a simple PHP script that creates a client and sends the job over, but the <em>gearman</em> tool already does this.</p>
<p>Example of using <em>gearman</em> as a client:</p>
<script src="https://gist.github.com/1677655.js?file=client.sh"></script>
<p>I use the <em>gearman</em> program as a worker less often. It is still useful for creating a simple worker to test my client code against. I can write my client code without a fully functional worker if the client code is not expecting a complex response.</p>
<p>Example of using <em>gearman</em> as a worker:</p>
<script src="https://gist.github.com/1677655.js?file=worker.sh"></script>
<p>The <em>gearman</em> tool comes with the standard options for specifying a host and port. There are a number of other options that may be of use in specific circumstances. I encourage you to read them over by typing "gearman -H" on the command line.</p>
http://activitystrea.ms/schema/1.0/postRegistering Functions With Gearman Workers2012-01-24T00:00:00+00:002012-01-24T00:00:00+00:00Herman J. Radtke IIIhttp://activitystrea.ms/schema/1.0/personhttps://hermanradtke.com/about/https://hermanradtke.com/2012/01/24/registering-functions-with-gearman-workers.html/<p>The Gearman <a href="php.net/manual/en/gearman.examples.php" target="_blank">examples on php.net</a> are a great primer for grokking how the Gearman client and worker interact with each other. One gripe I have is that the examples declare global functions for the worker to register. I feel this leads developers down the wrong path. With PHP 5.3, there is an easier solution though: anonymous functions.<span id="continue-reading"></span></p>
<p>Declaring global functions in a gearman worker script may not seem like a big deal, but these things have a way of catching up to you. I personally ran into this when I suggested that HauteLook start using GearmanManager to manage the gearman workers. A side effect of this is that all gearman worker scripts now run in the same instance of PHP. There were a number of occasions when workers would fail to load on production because of the global naming conflicts.</p>
<p>I prefer to register anonymous functions with my gearman workers. This keeps them out of the global scope and puts the logic right next to the worker registration call. This makes it hard to test the logic inside the anonymous function, but I never test that logic. I treat the functions I register with gearman as controllers. I pass the workload off to a model class that is fully unit tested.</p>
<p>Here is a nice example:</p>
<script src="https://gist.github.com/1673522.js?file=worker.php"></script>
<p>I would update the Gearman examples on php.net, but I think a lot of people are still using PHP 5.2 (or even earlier). Providing multiple ways of registering the workers may confuse people too. Maybe next year.</p>