<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Selinux</title><link>https://jwheel.org/tags/selinux/</link><description>Homepage of Justin Wheeler, an Open Source contributor and Free Software advocate from Georgia, USA.</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><managingEditor>Justin Wheeler</managingEditor><lastBuildDate>Mon, 13 Nov 2017 00:00:00 +0000</lastBuildDate><atom:link href="https://jwheel.org/rss/tags/selinux/index.xml" rel="self" type="application/rss+xml"/><item><title>ListenBrainz community gardening and user statistics</title><link>https://jwheel.org/blog/2017/11/listenbrainz-community-user-statistics/</link><pubDate>Mon, 13 Nov 2017 00:00:00 +0000</pubDate><guid>https://jwheel.org/blog/2017/11/listenbrainz-community-user-statistics/</guid><description><![CDATA[<p><em>This post is part of a series of posts where I contribute to the ListenBrainz project for my independent study at the Rochester Institute of Technology in the fall 2017 semester. For more posts, find them in <a href="https://jwfblog.wpenginepowered.com/tag/rit-2171/">this tag</a>.</em></p>
<hr>
<p>My progress with ListenBrainz slowed, but I am resuming the pace of contributing and advancing on my independent study timeline. This past week, I finished out assigned tasks to discuss contributor-related documentation, like a Code of Conduct, contributor guidelines, and a pull request template. I began research on user statistics and found some already created. I wrote one of my own, but need to learn more about Google BigQuery to advance further.</p>

<h2 id="paving-the-contributor-pathway">Paving the contributor pathway&nbsp;<a class="hanchor" href="#paving-the-contributor-pathway" aria-label="Anchor link for: Paving the contributor pathway">🔗</a></h2>
<p>
<figure>
  <img src="/blog/2017/11/Screenshot-from-2017-11-13-02-05-12.png" alt="Making it easier for people to contribute user statistics to ListenBrainz" loading="lazy">
  <figcaption>Making it easier for people to contribute to ListenBrainz with helpful contibuting guidelines</figcaption>
</figure>
</p>
<p>Earlier, I identified weaknesses for the ListenBrainz contributor pathway and found ways we could improve the pathway. This started with the development environment documentation. Now, I helped draft first revisions of our <a href="https://github.com/metabrainz/listenbrainz-server/pull/287">contributor guidelines</a>, <a href="https://github.com/metabrainz/listenbrainz-server/pull/286">Code of Conduct reference</a>, and <a href="https://github.com/metabrainz/listenbrainz-server/pull/288">pull request templates</a>. Together, these three documents have two goals.</p>
<ol>
<li><strong>Make it easier</strong> to contribute to ListenBrainz</li>
<li>Have a better experience and <strong>have fun</strong> contributing!</li>
</ol>
<p>Adding these documents addresses these goals. Additionally, the <a href="https://github.com/metabrainz/listenbrainz-server/community">GitHub community profile</a> also highlights these deliverables as ways to meet these goals. After getting feedback and seeing what others think, we make more revisions later (with some trial runs).</p>

<h2 id="back-to-selinux-context-flags">Back to SELinux context flags&nbsp;<a class="hanchor" href="#back-to-selinux-context-flags" aria-label="Anchor link for: Back to SELinux context flags">🔗</a></h2>
<p>Recently, I set my desktop back up and installed Docker for the first time on this machine; however, the development environment still failed to start. When I ran the script, it would eventually error out because of a permission denial. The web server image for ListenBrainz was failing.</p>
<p>After debugging, I noticed that I missed the SELinux volume tags for the ListenBrainz web server images in my original pull request, <a href="https://github.com/metabrainz/listenbrainz-server/pull/257">#257</a>. When I created the pull request, I might have had cached data that let my laptop run the development environment without a problem. In either case, it was an easy fix and I knew what the issue was when it happened. Therefore, I submitted a new fix in <a href="https://github.com/metabrainz/listenbrainz-server/pull/290">#290</a>.</p>

<h2 id="writing-new-user-statistics">Writing new user statistics&nbsp;<a class="hanchor" href="#writing-new-user-statistics" aria-label="Anchor link for: Writing new user statistics">🔗</a></h2>
<p>The most interesting part of my independent study is working with the music data to build and generate interesting statistics. I finally began exploring the <a href="https://github.com/metabrainz/listenbrainz-server/tree/master/listenbrainz/stats">existing statistics</a> in ListenBrainz. The statistic queries use BigQuery standard SQL. BigQuery helps rapidly scan and scale data queries to help with performance (I have a lot to learn about BigQuery).</p>

<h4 id="two-types-of-statistics">Two types of statistics&nbsp;<a class="hanchor" href="#two-types-of-statistics" aria-label="Anchor link for: Two types of statistics">🔗</a></h4>
<p>Additionally, ListenBrainz generates <strong>two types</strong> of statistics:</p>
<ol>
<li>Site-wide statistics</li>
<li>User statistics</li>
</ol>
<p>Site-wide statistics are metrics non-specific to a single user. There is only <a href="https://github.com/metabrainz/listenbrainz-server/blob/master/listenbrainz/stats/sitewide.py">one site-wide query</a> now. It counts how many artists were ever submitted to this ListenBrainz instance and returns an integer. There&rsquo;s room for expansion in site-wide statistics.</p>
<p>On the other hand, user statistics are metrics specific to a single user. There&rsquo;s a <a href="https://github.com/metabrainz/listenbrainz-server/blob/master/listenbrainz/stats/user.py">fair number already</a>, like the top artists and songs in a time period and the number of artists you&rsquo;ve listened to. These are a little more complete and offer more expansion for doing cool front-end work with something like <a href="https://d3js.org/">D3.js</a>.</p>

<h4 id="writing-user-statistics">Writing user statistics&nbsp;<a class="hanchor" href="#writing-user-statistics" aria-label="Anchor link for: Writing user statistics">🔗</a></h4>
<p>Of course, I had to try writing my own. One helpful query I thought of was getting a count of the songs you listened to over a time period (e.g. &ldquo;you listened to 500 songs this week!&rdquo;). I haven&rsquo;t tested it yet, but I have this in a local branch and hope to test it with real data soon.</p>
<pre tabindex="0"><code>def get_play_count(musicbrainz_id, time_interval=None): 
 
 filter_clause = &#34;&#34; 
 if time_interval: 
     filter_clause = &#34;AND listened_at &gt;=
     TIMESTAMP_SUB(CURRENT_TIME(), 
     INTERVAL {})&#34;.format(time_interval) 
 
 query = &#34;&#34;&#34;SELECT COUNT(release_msid) as listen_count 
            FROM {dataset_id}.{table_id} 
            WHERE user_name = @musicbrainz_id 
            {time_filter_clause} 
            LIMIT {limit} 
         &#34;&#34;&#34;.format( 
                 dataset_id=config.BIGQUERY_DATASET_ID, 
                 table_id=config.BIGQUERY_TABLE_ID, 
                 time_filter_clause=filter_clause, 
                 limit=config.STATS_ENTITY_LIMIT, 
            ) 
 
 parameters = [ 
     { 
         &#39;type&#39;: &#39;STRING&#39;, 
         &#39;name&#39;: &#39;musicbrainz_id&#39;, 
         &#39;value&#39;: musicbrainz_id 
     } 
 ] 
 
 return stats.run_query(query, parameters)
</code></pre>
<h2 id="researching-google-bigquery">Researching Google BigQuery&nbsp;<a class="hanchor" href="#researching-google-bigquery" aria-label="Anchor link for: Researching Google BigQuery">🔗</a></h2>
<p>My next steps for the independent study are researching <a href="https://cloud.google.com/bigquery/docs/">Google BigQuery</a>. After going through the existing statistics and understanding how ListenBrainz generates them, an understanding of Google BigQuery is essential to writing effective queries. When I become more comfortable with the tooling and how it works, I want to map out a plan of statistics to generate and measure.</p>
<p>Until then, the hacking continues! As always, keep the FOSS flag high…</p>]]></description></item></channel></rss>