<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>GitHub on Tony McGovern. Hacker. Artist. Storyteller. Data lover.</title>
    <link>/categories/github/index.xml</link>
    <description>Recent content in GitHub on Tony McGovern. Hacker. Artist. Storyteller. Data lover.</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <atom:link href="/categories/github/index.xml" rel="self" type="application/rss+xml" />
    
    <item>
      <title>Power Query M Version Control using GitHub</title>
      <link>/blog/power-query-version-control/</link>
      <pubDate>Mon, 12 Nov 2018 00:00:00 +0000</pubDate>
      
      <guid>/blog/power-query-version-control/</guid>
      <description>

&lt;h2 id=&#34;power-query-m-version-control-using-github&#34;&gt;Power Query M Version Control using GitHub&lt;/h2&gt;

&lt;p&gt;There is currently no obvious version control process to see a history of changes made to Power Query M functions within Excel or Power BI. The Advanced Editor in the Power Query Editor has no built-in function to capture changes made to the query. In other words, it&amp;rsquo;s very difficult to roll back queries to earlier versions once they&amp;rsquo;re executed.&lt;/p&gt;

&lt;p&gt;I tend to have a fairly involved data preparation process that I build into my Power Query M functions. It&amp;rsquo;s not uncommon for my queries to &lt;a href=&#34;https://github.com/tonmcg/powersocrata/blob/master/M/Socrata.ReadData.pq&#34;&gt;contain hundreds of lines&lt;/a&gt;. Once they become this large, managing changes to my queries becomes onerous. It&amp;rsquo;s at this point I often switch from writing Power Query M functions in the Advanced Editor to writing them in a text editor like Notepad++. One of the benefits of separating the functions from the Power BI or Excel file is that I can store the functions in a .pq file on GitHub and &lt;a href=&#34;https://github.com/tonmcg/powersocrata/commit/52fd8905c9cbe4d1c7143dd61264af0c6f7b3a50#diff-64eaca6fc062ee30a8d65f09d2beb4aa&#34;&gt;see the entire timeline of changes made to the file&lt;/a&gt;. Storing .pq files on GitHub becomes especially &lt;a href=&#34;https://guides.github.com/introduction/git-handbook/&#34;&gt;useful in collaborative environments&lt;/a&gt; where team members work on the same set of Power Query M functions. And even if you&amp;rsquo;re working by yourself, GitHub is still useful for projects that contain a number of queries, like this library of M functions that &lt;a href=&#34;https://github.com/tonmcg/powersocrata&#34;&gt;prepare data from the Socrata Open Data API&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So if I&amp;rsquo;m not writing these functions in the Advanced Editor, how does the Power Query Editor know they exist? How do we execute Power Query M functions that reside somewhere else?&lt;/p&gt;

&lt;h3 id=&#34;expression-evaluate-and-the-shared-environment&#34;&gt;Expression.Evaluate() and the &lt;code&gt;#shared&lt;/code&gt; Environment&lt;/h3&gt;

&lt;p&gt;Consider the Power Query M code below:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;let
  Source = 
    Expression.Evaluate(
      Text.FromBinary(
        Web.Contents(
          &amp;quot;https://raw.githubusercontent.com/tonmcg/powersocrata/master/M/Socrata.ReadData.pq&amp;quot;
        )
      ),
      #shared
    )
in
  Source
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If we copy the code into the Advanced Editor, we should see the following function appear:
&lt;img src=&#34;../../img/main/Excel_GitHub_Query.png&#34; alt=&#34;Excel&#34; /&gt;&lt;/p&gt;

&lt;p&gt;The screenshot shows the Power Query Editor rendering a function with multiple parameters. Nowhere in the code above have we defined parameters. So where did they come from? The answer is that I created a custom Power Query M function called &lt;code&gt;Socrata.ReadData()&lt;/code&gt; and stored it in a repository on GitHub. I&amp;rsquo;m using the &lt;code&gt;Web.Contents()&lt;/code&gt; function to return the &lt;a href=&#34;https://raw.githubusercontent.com/tonmcg/powersocrata/master/M/Socrata.ReadData.pq&#34;&gt;contents of the file&lt;/a&gt; and then the &lt;code&gt;Text.FromBinary()&lt;/code&gt; function to render the text contained within. The parameters are defined within the function itself, which, again, is stored on GitHub.&lt;/p&gt;

&lt;p&gt;The most interesting part of the function is the use of &lt;code&gt;Expression.Evaluate()&lt;/code&gt;. At its simplest, it &lt;a href=&#34;https://docs.microsoft.com/en-us/powerquery-m/expression-evaluate&#34;&gt;evaluates text and returns an evaluated value&lt;/a&gt;. In our case, the evaluated text is the text within the &lt;code&gt;Socrata.ReadData()&lt;/code&gt; function. The evaluated value is the actual function itself. Put another way, &lt;code&gt;Expression.Evaluate()&lt;/code&gt; allows us to make a call to GitHub and return the function&amp;rsquo;s contents, render the text of the function, and tell the Power Query engine to treat the rendered text as an M function.&lt;/p&gt;

&lt;p&gt;But that&amp;rsquo;s only half of the answer. While the first parameter of the &lt;code&gt;Expression.Evaluate()&lt;/code&gt; function requires an expression as text, the second parameter asks for the &lt;em&gt;environment&lt;/em&gt; in which that expression is evaluated, that is, the set of identifiers the expression is allowed to reference. In our example, I&amp;rsquo;ve defined the &lt;em&gt;environment&lt;/em&gt; as the &lt;code&gt;#shared&lt;/code&gt; variable. This pattern is what allows a Power Query M function that resides on GitHub to be executed in our Power Query Editor.&lt;/p&gt;
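
&lt;p&gt;To make the three stages easier to follow, here is a sketch of the same query with one named step per stage. The step names (&lt;code&gt;RawFile&lt;/code&gt;, &lt;code&gt;FunctionText&lt;/code&gt;, &lt;code&gt;ReadData&lt;/code&gt;) are my own labels for illustration; the behavior should match the nested version above.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;let
  // Stage 1: download the .pq file from GitHub as binary content
  RawFile =
    Web.Contents(
      &amp;quot;https://raw.githubusercontent.com/tonmcg/powersocrata/master/M/Socrata.ReadData.pq&amp;quot;
    ),
  // Stage 2: decode the binary content into the text of the M function
  FunctionText = Text.FromBinary(RawFile),
  // Stage 3: evaluate the text, using #shared as the environment,
  // so that the result is the function itself rather than a string
  ReadData = Expression.Evaluate(FunctionText, #shared)
in
  ReadData
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Splitting the steps this way also lets you click on &lt;code&gt;FunctionText&lt;/code&gt; in the Applied Steps pane to inspect exactly what text is about to be evaluated.&lt;/p&gt;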

&lt;p&gt;I&amp;rsquo;ll not go into it here, but if you&amp;rsquo;re interested in understanding the &lt;em&gt;environment&lt;/em&gt; parameter and the &lt;code&gt;#shared&lt;/code&gt; variable, here is a list of resources that should get you started:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Lars Schreiber and Imke Feldmann have done &lt;a href=&#34;https://ssbi-blog.de/technical-topics-english/the-environment-concept-in-m-for-power-query-and-power-bi-desktop-part-3/&#34;&gt;incredible work simplifying these concepts&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Chris Webb provides simple examples that help &lt;a href=&#34;https://blog.crossjoin.co.uk/2015/02/06/expression-evaluate-in-power-querym/&#34;&gt;motivate understanding in this area&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Finally, Imke Feldmann uses this pattern often and provides numerous examples on her blog. Here&amp;rsquo;s &lt;a href=&#34;https://www.thebiccountant.com/2018/05/17/automatically-create-function-record-for-expression-evaluate-in-power-bi-and-power-query/&#34;&gt;one example&lt;/a&gt;.&lt;/li&gt;
&lt;/ol&gt;
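
&lt;p&gt;As a small, self-contained taste of what those resources cover, the sketch below evaluates a function from text using an explicit environment record instead of &lt;code&gt;#shared&lt;/code&gt;. The function text and the step names are made up for illustration; the point is that the evaluated text can only see the identifiers listed in the record.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;let
  // The text of a tiny M function (hypothetical example, not from the PowerSocrata library)
  FunctionText = &amp;quot;(name as text) =&amp;gt; Text.Upper(name)&amp;quot;,
  // The environment: the only identifier the evaluated text may reference
  Environment = [Text.Upper = Text.Upper],
  // Evaluating the text returns a function value
  Shout = Expression.Evaluate(FunctionText, Environment),
  // Calling it like any other function returns &amp;quot;POWER QUERY&amp;quot;
  Result = Shout(&amp;quot;power query&amp;quot;)
in
  Result
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Passing &lt;code&gt;#shared&lt;/code&gt; instead of a hand-built record is what saves us from listing every library function that a several-hundred-line query relies on.&lt;/p&gt;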

&lt;h3 id=&#34;making-it-work-in-power-bi-dataflows&#34;&gt;Making it Work in Power BI Dataflows&lt;/h3&gt;

&lt;p&gt;The most exciting part, however, is that this pattern works not only in Excel and Power BI&lt;sup id=&#34;a1&#34;&gt;&lt;a href=&#34;#warning&#34;&gt;[1]&lt;/a&gt;&lt;/sup&gt;, but through the browser as well &lt;a href=&#34;https://www.tonymcgovern.com/blog/power-bi-dataflows/&#34;&gt;with Power BI Dataflows&lt;/a&gt;. We can get the &lt;code&gt;Socrata.ReadData()&lt;/code&gt; function to work in dataflows by using the Power Query M code above in the Power BI dataflows Query Editor:
&lt;img src=&#34;../../img/main/Dataflows_GitHub_Query.png&#34; alt=&#34;Dataflows&#34; /&gt;&lt;/p&gt;

&lt;p&gt;If you want to see what this looks like in action, I &lt;a href=&#34;https://twitter.com/tonmcg/status/1060617265501126656&#34;&gt;tweeted a GIF of this working&lt;/a&gt; in Power BI dataflows.&lt;/p&gt;

&lt;p&gt;There are boundless opportunities here to use GitHub&amp;rsquo;s version control system to manage the entire timeline of changes made to projects that use Power Query M functions. Distributed teams around the world can simultaneously work on the code, understand what changes have been made, and collaborate at any time while maintaining source code integrity.&lt;/p&gt;

&lt;p&gt;Do you use this pattern as well? What else have you found? Let me know in the comments section below.&lt;/p&gt;

&lt;h3 id=&#34;update-13-november-2018&#34;&gt;Update 13 November 2018&lt;/h3&gt;

&lt;p&gt;Matt Masson offers a comment in response to this blog post:
&lt;blockquote class=&#34;twitter-tweet&#34; data-lang=&#34;en&#34;&gt;&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;Well the gateway infrastructure doesn&amp;#39;t support dynamic data sources, and I&amp;#39;m not sure if / when that will happen. I&amp;#39;m also not sure it is wise to ever pull dynamic code from web locations.&lt;/p&gt;&amp;mdash; Matt Masson (@mattmasson) &lt;a href=&#34;https://twitter.com/mattmasson/status/1062066012298842118?ref_src=twsrc%5Etfw&#34;&gt;November 12, 2018&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async src=&#34;https://platform.twitter.com/widgets.js&#34; charset=&#34;utf-8&#34;&gt;&lt;/script&gt;&lt;/p&gt;

&lt;p&gt;I think Matt is suggesting this pattern of storing M code on a separate server could open up your application or dataflow to an exploit. If so, he&amp;rsquo;s absolutely correct. The code above is stored on a public GitHub repository, which means anyone can submit a pull request that, if merged without careful review, surreptitiously injects code and introduces a security vulnerability.&lt;/p&gt;
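
&lt;p&gt;One way to narrow (though not eliminate) that exposure might be to point the query at a specific commit rather than at the &lt;code&gt;master&lt;/code&gt; branch, so later changes to the repository don&amp;rsquo;t silently change the code being evaluated. The sketch below uses the commit hash linked earlier in this post and assumes the file exists at the same path in that commit; treat the URL as illustrative and substitute a commit you have reviewed yourself.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;let
  Source = 
    Expression.Evaluate(
      Text.FromBinary(
        Web.Contents(
          // Assumption: M/Socrata.ReadData.pq is present at this path in this commit
          &amp;quot;https://raw.githubusercontent.com/tonmcg/powersocrata/52fd8905c9cbe4d1c7143dd61264af0c6f7b3a50/M/Socrata.ReadData.pq&amp;quot;
        )
      ),
      #shared
    )
in
  Source
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Pinning trades convenience for predictability: picking up a newer version of the function means editing the hash by hand, and the query still evaluates code fetched from the web.&lt;/p&gt;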

&lt;p&gt;This solution is far from ideal. But the question remains: is there a secure version control system that allows me to see changes to my M code over time? If you have ideas, feel free to leave them in the comments below.&lt;/p&gt;

&lt;ol&gt;&lt;li id=&#34;warning&#34;&gt;&lt;b&gt;Be aware&lt;/b&gt;: the use of the &lt;code&gt;#shared&lt;/code&gt; variable within the &lt;code&gt;Expression.Evaluate()&lt;/code&gt; function is not currently a supported feature, which means it can &lt;a href=&#34;https://ssbi-blog.de/technical-topics-english/the-environment-concept-in-m-for-power-query-and-power-bi-desktop-part-3/#comment-134&#34;&gt;go away at any time&lt;/a&gt;.&lt;/li&gt;&lt;/ol&gt;
</description>
    </item>
    
  </channel>
</rss>