Unlike #1, you can skip optional parameters when calling functions.
By using a record parameter.
Usually it’s a optional options as record
This lets you call a function with any combination of keyword arguments.
Merging Default Values With the User’s Record
With record parameters you can merge the user’s record with your your own default values. It lets the user pick and choose which defaults to override — compared to using nulls as an “all-or-nothing” approach. I’ll cover this in a future post.
Creating Your Own Custom Functions
Let’s write a wrapper that calls Text.Combine . It’s a simple example to see how the parameter types differ
Join_Nullable = (texts as list, separator as nullable text) =>
Text.Combine( texts, separator ),
Join_Optional = (texts as list, optional separator as text) =>
Text.Combine( texts, separator ),
Type 1: Using Nullable Parameters
Regular nullable parameters are still mandatory. Even if it’s null.
When you import columns with Table.TransformColumnTypes, the UI doesn’t include the culture parameter. Unless you choose “using locale”.
// it's declared as
Table.TransformColumnTypes(
table as table, typeTransformations as list,
optional culture as text) as table
// The GUI skips the culture parameter
= Table.TransformColumnTypes( Source,{ {"Date", type text}} )
// if you click on "locale"
= Table.TransformColumnTypes( Source,{ {"Date", type text}}, "en-GB" )
Type 3 Examples: Using Optional Records
// It's declared as
DateTime.FromText( text as nullable text, optional options as any) as nullable datetime
// without the optional parameter
= DateTime.FromText("2010-12-31T01:30:25")
// They pass 2 keyword-style arguments.
= DateTime.FromText(
"30 Dez 2010 02:04:50.369730",
[
Format = "dd MMM yyyy HH:mm:ss.ffffff",
Culture="de-DE"
] )
Why does DateTime.FromText use options as any and not options as record ? In this case it’s for backwards compatibility. If they were writing it from scratch today, it could have used a record. Older versions used a type text parameter.
Whitespace between function calls and the name are allowed. Including newlines These are equivalent statements:
= DoStuff( args )
// and
= DoStuff
(
args )
Record lookups also allow whitespace. This is totally valid syntax wise. Not necessarily morally though.
let
Func = () => [
user = [ Name = "bob" ]
]
in
Func
(
)
[
user
]
Misleading Comments inside Lists, Records, and Function Calls are allowed
Comments do not affect parsing or execution. Without syntax highlighting it looks like 4 is the final item in the list
let
num = List.Count( { 0
..
5 } & { 3, 4 /*
}) /* fake ending here here * /
without syntax highlighting, it looks like the func call ended
later, secretly do more
*/ ,99 } )
in num
Now it’s slightly easier
let
num = List.Count( { 0
..
5 } & { 3, 4 /*
}) /* fake ending here here * /
without syntax highlighting, it looks like the func call ended
later, secretly do more
*/ ,99 } )
in num
Have you written a report where dates import right. But they break when ran on another machine? If you import right, you don’t have to set cultures in your report settings.
If you pick Date and “Using Locale…” it will replace this
// replace this
= Table.TransformColumnTypes( Source,{ {"Date", type text}} )
// with this
= Table.TransformColumnTypes( Source,{ {"Date", type text}}, "en-GB" )
Now I can import a CSV that uses the culture “English UK” or “German” or others And it still works when it’s ran on an “English-US” environment
When is Culture / Locale used?
Every time you convert dates and numbers to text, culture is involved. When you convert from text to numeric values, culture is involved. If you don’t set the culture parameter, it falls back on the value Culture.Current . Mine uses “en-us”
Sample data to break things
Here’s a variety of dates to compare which breaks. They came from cultures using the named ShortDate “d” format string
Any value in Power Query can store info in a metadatarecord. “Datasources” often have extra info. Web.Contents exposes the full request url including the Query variables, request headers, HTTP Status codes, and more.
The core “trick” to making this work is calling Value.Metadata on the right step.
For Web.Contents and File.Contents you want the binary value they return, before converting the response
For Folder.Files and Folder.Contents you want the binary value in the column named [Content]
Splitting Web.Contents into steps
Lets call this Rest API to some JSON: https://httpbin.org/json If you add a query using the Web UI, it combines “download raw bytes” and “convert to json” as one step.
let
Source = Json.Document(Web.Contents("https://httpbin.org", [RelativePath = "/json"] ))
in
Source
To access the response, we need to split them up:
let
ResponseBytes = Web.Contents("https://httpbin.org", [RelativePath = "/json"] ),
Source = Json.Document( ResponseBytes )
in
Source
Reading the Metadata
First call Value.Metadata on our first step. It returns a regular record. You can drill down to copy important properties to the top level
[
BaseUrl = "https://httpbin.org",
RawBytes = Web.Contents( BaseUrl, [ RelativePath = "/json" ]), // type is: binary|error
ResponseMeta = Value.Metadata( RawBytes ), // type is: record
Json = Json.Document( RawBytes ), // type is: record/list/error
// Now lets grab important fields by drilling into the ResponseMeta record
StatusCode = ResponseMeta[Response.Status]?, // type is: text
// Content.Uri is a function that returns the full filepath for files,
// For web requests you get the full url including the value of the query string
FullRequestUrl = ResponseMeta[Content.Uri]() // type is: text
]
Decoding the response as raw text for debugging
Say your API returns HTML when you expect JSON Some APIs will return errors as HTML, so if something isn’t right it’s worth checking.
That causes Json.Document to fail. Because we split steps up — RawText continues to work. It’s a quick way to verify if it’s actually returning CSV / JSON / HTML / etc.
If your files aren’t using UTF8 you’ll need to include the encoding parameter. TextEncoding.Utf8 is a good default for the web
RawText = Text.FromBinary( RawBytes ), // type is: text/error
Tip: The UI might use Lines.FromBinary, which means you need to combine them to one string. Text.FromBinary saves you a step.
PowerQuery has metadata that you don’t normally see. For example, take the function Table.FromRecords
Create a new blank query, and set the value to a function’s name. When you reference a function without arguments or parenthesis, it displays documentation. ( It’s mostly the same as the online docs )
Where does this come from? A lot of it is generated by metadata on the function’s type itself.
Let’s start drilling down. Using the function Value.Type you can view the type’s definition
It doesn’t seem very useful at first. The data we want is from the metadata, the Ascribed types.
How do you document your own functions? Here’s a medium example. The function declared further below starts with this definition:
Text.ReplacePartialMatches_impl = (
source as text, mapping as table
) as text => ...
To create the type, you start your function type almost the exact same as the definition. Then wrap it inside a type function
Text.ReplacePartialMatches.Type = type function(
source as text,
mapping as table
) as text meta [ .. ]
Next you start adding records to the final type’s metadata. If you haven’t seen the meta operator, check out Ben’s series:
The part that comes after meta operator is a regular record
let
Text.ReplacePartialMatches = Value.ReplaceType( Text.ReplacePartialMatches_impl, Text.ReplacePartialMatches.Type ),
Text.ReplacePartialMatches.Type = type function(
source as text,
mapping as Table.Type
) as text meta [
Documentation.Name = "Text.ReplacePartialMatches",
Documentation.LongDescription = Text.Combine({
"Test strings for partial text matches. Replace the entire cell/value with the new replacement text.",
"",
"Mapping table requires two columns: <code>[Partial]</code> and <code>[New Value]</code> "
}, "<br>")
],
Text.ReplacePartialMatches_impl = ( source as text, mapping as table ) as text =>
// todo: performance, exit early on first replacement
let
mappingList = Table.ToRecords( mapping ),
result = List.Accumulate(
mappingList,
source,
(state, cur) =>
if Text.Contains( state, cur[Partial], Comparer.OrdinalIgnoreCase )
then cur[New Value] else state
)
in result
in
Text.ReplacePartialMatches
options is an record, when used this way it’s similar to Python’s kwargs
Merging two records with update the existing fields, adding new fields, if they do not yet exist
(in PowerQuery) the order of steps don’t change the final result (order of execution is the same) That’s why defaults after config works
Sometimes readability improves when placing large “blocks” like values_list out of order so the logic is on top
I replaced values *before* converting them to text so you have more control (before coercion )
Text.JoinSpecialValues_impl = (source as list, optional options as nullable record) as text =>
let
config = Record.Combine({defaults, options ?? []}),
defaults = [
Separator = "|",
UseSpecialSymbols = true
],
text_list = List.Transform( values_list, Text.From),
joined_string = Text.Combine( text_list, config[Separator] ),
values_list = if not config[UseSpecialSymbols] then source else
List.ReplaceMatchingItems( source,
{
// replace true null, and true empty strings (vs whitespace)
{ null, "␀"},
{ "#(cr,lf)", "#(240d)" }, // is #(2424)" }
{ "#(lf)", "" }, // is #(2424)" }
{"", "␠"}
} )
in
joined_string,
Powershell
Goto Everything
Goto /c/foo/bar
# go back
> Goto -Back
> Goto '-' # normal cd history works too
> Goto '+'
# Goto the world
$Profile | goto # go to string's path
Get-Item $PROFILE | goto # cd to the FileItem's path
gcm EditFunc | goto # jump to function declaration
gmo NameIt | Goto # go to module's folder
[CompletionResult] | goto # to docs for
# <https://docs.microsoft.com/en-us/dotnet/api/System.Management.Automation.CompletionResult>
# Open git repos in browser
goto git microsoft/powerquery-parser | goto
goto git@github.com:microsoft/powerquery-parser.git | goto
# goto the newest log
Get-ChildItem 'c:\root\manyLogs' -Recurse
| Sort LastWriteTime -desc -top 1 | goto
Is it a bug, or, is the extra y-axis padding working as intended in a situation where intended is not intended ( ie: column count orders of magnitude larger than the terminal’s column count )
# 1] highlight matches in red
# 2] preserves all lines
> history | grep --perl-regex --ignore-case --color=always 'less|$'
# 3] add paging
> history | grep --perl-regex --ignore-case --color=always 'less|$' | less --raw-control-chars
# 4] View a log
> less someLog
# 5 start on the last line of a log
> less +G someLog
Power Query Sugar for selecting distinct filters. If the condition is not true, then it throws an error. Name = "Orders" works because it results in a distinct value from the column Name It’s valid even though the final query is many records. It’s the “distinctness” of the filter that is required to be true
How to wrap long lines, *without* splitting words. The input is a very long string, with no newlines:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed eu laoreet turpis. Curabitur lacinia, risus ut rhoncus mattis, turpis lorem iaculis justo, nec ultrices arcu erat vitae felis. Pellentesque vulputate efficitur scelerisque. Etiam bibendum dignissim mauris
Query
<a href="https://docs.microsoft.com/en-us/powerquery-m/list-accumulate">List.Accumulate</a> is an aggregate function. I’m using it to “sum” — to add strings together. If the current line plus the next word is longer than 80 characters, then insert a newline first.
To find the length of the current line, I only want the length after the very last newline. Occurrence.Last returns the last match, else -1 if nothing is found.
let
LoremIpsum = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed eu laoreet turpis. Curabitur lacinia, risus ut rhoncus mattis, turpis lorem iaculis justo, nec ultrices arcu erat vitae felis. Pellentesque vulputate efficitur scelerisque. Etiam bibendum dignissim mauris",
// calculate length of string *after* the rightmost newline
Text_LengthAfterNewline = (string as text) as number =>
let
posLastNewline = Text.PositionOf(string, "#(lf)", Occurrence.Last),
posOffset = if posLastNewline <> -1 then posLastNewline else 0,
deltaLen = Text.Length(string) - posOffset
in
deltaLen,
// word wraps text
Text_WordWrap = (string as text, max_width as number) as text =>
let
words = Text.Split(string, " "),
accum_result = List.Accumulate(
words, "",
(state as text, current as text) as text =>
let
len = Text_LengthAfterNewline(state) + Text.Length(current) + 1,
maybeNewline =
if len > max_width then "#(lf)" else "",
accum_string = Text.Combine({state & maybeNewline, current}, " ")
in
accum_string
)
in
accum_result,
wrapped_text = Text_WordWrap(LoremIpsum, 80)
in
wrapped_text
The final result is 80 characters wide or less
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed eu laoreet turpis.
Curabitur lacinia, risus ut rhoncus mattis, turpis lorem iaculis justo, nec
ultrices arcu erat vitae felis. Pellentesque vulputate efficitur scelerisque.
Etiam bibendum dignissim mauris
Validating lengths of each line
let
Source = #"wrap lines",
validate_lengths =
let
lines = Text.Split(Source, "#(lf)"),
lengths = List.Transform(
lines,
each [ String = _, Length = Text.Length(_) ])
in
Table.FromRecords(
lengths,
type table[String = text, Length = number],
MissingField.Error )
in
validate_lengths
Check out the new tutorial at https://blog.powp.co/my-power-query-web-contents-cheat-sheet-6a5bbfdce5eb
Web.Contents without Refresh Errors
The main cause of Web.Contents not refreshing can be fixed by adding the options[Query] and options[RelativeaPath] parameters. (The UI doesn’t create them for you)
/* web request, act based on the HTTP Status Code returned
see more:
override default error handling: https://docs.microsoft.com/en-us/power-query/handlingstatuscodes
example wait-for loop: https://docs.microsoft.com/en-us/power-query/waitretry#manualstatushandling
*/
let
WikiRequest = (pageName as text) as any =>
let
BaseUrl = "https://en.wikipedia.org/wiki",
Options = [
RelativePath = pageName,
ManualStatusHandling = {400, 404}
],
// wrap 'Response' in 'Binary.Buffer' if you are using it multiple times
response = Web.Contents(BaseUrl, Options),
buffered = Binary.Buffer(response),
response_metadata = Value.Metadata(response),
status_code = response_metadata[Response.Status],
final_result = [
buffered = buffered,
response_metadata = response_metadata
]
in
final_result,
Queries = {"Cat", "DoesNot Exist fake page"},
Items = List.Transform(
Queries,
each WikiRequest( _ )
),
ResponseTable = Table.FromRecords(
Items,
type table[buffered = binary, response_metadata = record], MissingField.Error
),
#"Expanded HTTP Status Codes" = Table.ExpandRecordColumn(ResponseTable, "response_metadata", {"Response.Status"}, {"Response.Status"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded HTTP Status Codes",{{"Response.Status", Int64.Type}})
in
#"Changed Type"
WebRequest: Wrapper with Better Defaults
You can get the full file with extra comments: WebRequest.pq
let
/*
Example using this url:
(https://www.metaweather.com/api/location/search?lattlong=36.96,-122.02)
WebRequest(
"https://www.metaweather.com",
"api/location/search",
[ lattlong = "36.96,-122.02" ]
)
Details on preventing "Refresh Errors", using 'Query' and 'RelativePath':
- Not using Query and Relative path cause refresh errors:
(https://blog.crossjoin.co.uk/2016/08/23/web-contents-m-functions-and-dataset-refresh-errors-in-power-bi/)
- You can opt-in to Skip-Test:
(https://blog.crossjoin.co.uk/2019/04/25/skip-test-connection-power-bi-refresh-failures/)
- Debugging and tracing the HTTP requests
(https://blog.crossjoin.co.uk/2019/11/17/troubleshooting-web-service-refresh-problems-in-power-bi-with-the-power-query-diagnostics-feature/)
*/
WebRequest = (
staticPath as text, // domain
relativePath as text, // basically use everything after ".com" to "?"
optional query as nullable record, // url query string
optional asRaw as nullable logical, // use true if content is not Json
optional headers as nullable record // optional HTTP headers as a record
) as any =>
let
query = query ?? [],
asRaw = asRaw ?? false, // toggles calling Json.Document() or not
headers = headers ?? [
Accept="application/json"
],
baseUrl = staticPath,
options = [
RelativePath = relativePath,
Headers = headers,
Query = query
// optionally toggle handling errors for specific HTTP Status codes
// ManualStatusHandling = {400, 404}
],
// wrap 'Response' in 'Binary.Buffer' if you are using it multiple times
response = Web.Contents(staticPath, options),
metadata = Value.Metadata(response),
buffered = Binary.Buffer(response),
result = Json.Document(buffered)
in
[
response = if asRaw then buffered else result,
status_code = metadata[Response.Status],
metadata = metadata
]
in
WebRequest
Chaining Web.Contents to Merge Many Queries
let
response_locations = WebRequest(
"https://www.metaweather.com",
"api/location/search",
[ lattlong = "36.96,-122.02" ]
),
location_schema = type table[
distance = number, title = text,
location_type = text, woeid = number, latt_long = text
],
cityListing = Table.FromRecords(response_locations[response], location_schema, MissingField.Error),
city_mergedRequest = Table.AddColumn(
cityListing,
"LocationWeather",
(row as record) as any =>
let
woeid = Text.From(row[woeid]),
response = WebRequest(
"https://www.metaweather.com",
"api/location/" & woeid,
[]
)
in
response,
type any
)
in
city_mergedRequest
Html.Table – Parsing with CSS Selectors
Select a Single Element: an Image, url, text, etc…
This fetches the current heading text on the blog
let
Url = "https://powerbi.microsoft.com/en-us/blog/",
Response = Web.Contents( Url ),
/*
note: normally do not pass dynamic urls like this, see cheatsheet on preventing refresh errors
Non-tabular scraping like Images or any single elements, does not use a "RowSelector"
This CSS Selector finds exactly one element, the Page's Header Text
.section-featured-post .text-heading1 a
*/
HeaderText = Html.Table(
Response,
{
{ "Page Header", ".section-featured-post .text-heading1 a" }
}
)
in
HeaderText
Parsing Element’s Attributes
let
Url = "https://powerbi.microsoft.com/en-us/blog/",
Response = Web.Contents( Url ),
/*
The 3rd argument in "columnNameSelectorPairs" is the transformation function.
by default it uses:
each _[TextContent]
*/
HeaderAsElement = Html.Table(
Response,
{ { "Link", ".section-featured-post .text-heading1 a", each _ } }
),
ElementRecord = Table.ExpandRecordColumn(
HeaderAsElement, "Link",
{"TagName", "TextContent", "Attributes"}, {"TagName", "TextContent", "Attributes"}
),
ExpandedAttributes = Table.ExpandRecordColumn(
ElementRecord, "Attributes",
{"href", "rel", "title"}, {"attr.href", "attr.rel", "attr.title"}
)
in
ExpandedAttributes
Select Tables using your own CSS Selectors
Results of the RowSelectorResults of columnNameSelectorPairs
// Docs on Enum
let
Source = "https://docs.microsoft.com/en-us/previous-versions/dynamics/ax-2012/reference/gg841505(v=ax.60)",
// note: normally do not pass dynamic urls like this, see cheatsheet on preventing refresh errors
Response = Web.BrowserContents( Source ),
/*
Think of "RowSelector" as selecting a table row
Then for every row, you select "columns" using the "columnNameSelectorPairs" selector
The combination gives you a table cell.
For more on CSS Selectors, see: <https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors>
*/
columnNameSelectorPairs = {
// column names don't matter here, since I'm using .PromoteHeaders
{ "Column1", "TABLE.table > * > TR > :nth-child(1)" },
{ "Column2", "TABLE.table > * > TR > :nth-child(2)" },
{ "Column3", "TABLE.table > * > TR > :nth-child(3)" }
},
t1 = Html.Table(
Response, columnNameSelectorPairs,
[RowSelector = "TABLE.table > * > TR"]
),
t2 = Table.PromoteHeaders( t1, [PromoteAllScalars = true] ),
FinalTable = Table.TransformColumnTypes(
t2,
{ { "Name", type text }, { "Value", Int64.Type }, { "Description", type text} }
)
in
FinalTable
SQL Native Query
Parameterized SQL queries with Value.NativeQuery()
let
Source = Sql.Database("localhost", "Adventure Works DW"),
Test = Value.NativeQuery(
Source,
"SELECT * FROM DimDate
WHERE EnglishMonthName=@MonthName",
[
MonthName = "March",
DayName = "Tuesday"
]
)
in
Test
Other
Cartesian Product
let
#"Add Column Pair2" = Table.AddColumn(
Table_Pairs1, "Pairs2",
each Table_Pairs2,
Table.Type
),
#"Expanded Pairs" = Table.ExpandTableColumn(
#"Add Column Pair2",
"Pairs2",
{"Color", "Property"},
{"Color", "Property"}
)
in
#"Expanded Pairs"
You can make start with ‘enter data’, then duplicate that line. You end up with two steps, each are their own enter data
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcitKzEvOLFbSUfLJrEosSgEyTJRidaKVHHNSKxJTijITgSLuqcnZ+UDaEiwTkgpS5ZtfkgGkjJRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Species = _t, Age = _t]),
Source2 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8krNU9JRciotKUktSsupBLINlWJ1opWCE4sSM4A858QSIGlkDBYMSU0Bclzy04GkiYFSbCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Species = _t, Age = _t]),
Final = Table.Combine( { Source, Source2 } )
in
Final
Convert Table to JSON
TableToJson = (source as table, optional encoding as nullable number) as text =>
let
encoding = encoding ?? TextEncoding.Utf8,
bin = Json.FromValue(source, encoding),
jsonAsText = Text.FromBinary(bin, encoding)
in
jsonAsText
let
// Converts a list of any type to text. Works well on most types
// although to support all cases, it requires more logic
mixedList = {4, "cat", #date(1999,5,9), 0.4},
ListAsText = List.Transform(mixedList, each Text.From(_)),
CsvText = Text.Combine( ListAsText, ", ")
in
//output: "4, cat, 5/9/1999, 0.4"
CsvText
Functions
Using Optional Parameters with Default Values
The Null_coalescing_operator ?? simplifies default values. encoding will be set to what the user passed, unless it’s null. In that case, it is set to TextEncoding.Utf8
let
Read_Json = (json as any, encoding as nullable number) as any =>
// calls Json.Document, using UTF8 by default
let
encoding = encoding ?? TextEncoding.Utf8,
result = Json.Document(json, encoding)
in
result
in
Read_Json
Mapping Function Calls based on Type
Caller chooses which type of conversioin to use, based on type names { date, datetime, datetimezone }
let
// 1] get a `type` from caller
// 2] return a difference function based on the type
GetTransformByType = (_type as type) as function =>
let
// originally from: <https://docs.microsoft.com/en-us/power-query/helperfunctions#tablechangetype>
func_transform =
if (Type.Is(_type, type date)) then Date.From
else if (Type.Is(_type, type datetime)) then DateTime.From
else if (Type.Is(_type, type datetimezone)) then DateTimeZone.From
else (t) => t // else return self
in
func_transform,
nowDtz = DateTimeZone.LocalNow(),
// invoke in 2 steps
toDate = GetTransformByType(type date),
callDate = toDate( DateTimeZone.FixedLocalNow() ),
// create, and invoke functions
Results = [
asDate = (GetTransformByType(type date))( nowDtz ),
asDateTime = (GetTransformByType(type datetime))( nowDtz ),
asDateTimeZone = (GetTransformByType(type datetimezone))( nowDtz )
]
in
Results
Power BI Discord asked the question: > How do you use duplicate keys, and dynamic urls with <a href="https://docs.microsoft.com/en-us/powerquery-m/web-contents">Web.Contents</a> ? (They were using a web API that required duplicate keys) After I wrote this, Chris Webb found an easier solution.
Requirements
You can’t use a regular record because keys must be distinct. Query = [ Key1 = 1, Key1 = 10] will throw an error.
You can’t put the **dynamic url** in the first argument Web.Contents or else refreshes can break
let
QueryStr_UsingDuplicateKeys = (key as text, values as list) as text =>
// values are the 'value' of 'key'-> 'value' pairs
let
escapedList = List.Transform(
values,
each
key & "=" & Uri.EscapeDataString( Text.From(_) )
),
joinedArgs = Text.Combine(escapedList, "&")
in
joinedArgs
in
QueryStr_UsingDuplicateKeys
Note: BaseUrl is for the static part of the url. Everything else should be in options[RelativePath] or options[Query] See docs: Web.Contents for details.
Which results in the url: <a href="http://jsonplaceholder.typicode.com/comments">http://jsonplaceholder.typicode.com/comments</a> This is a good place to use Power Query‘s null coalesce operator (Which isn’t in the official docs)