Categories
Power BI Power Query

Importing text files in Power BI

What do you do when text files are not formatted as .csv or .json ? If they repeat a pattern, you can automate it.

( You can download the final .PBIX report here )

This file always uses:

  • 3 values for for each “table” row
  • One line per value – which becomes our “columns”
  • every “column” line has a : in it. If you skip all lines without a : it filters all unwanted lines
Title
---------------------
DeviceID: One
Info1: One:Info1
Info2: One:Info2

---------------------

DeviceID: Two
Info1: Two: Info1
Info2: Two: Info2

---------------------

DeviceID: DeviceID
Info1: Three: Info1
Info2: Three: Info2

First take a look at the final function. Then I’ll explain how it works.

The main query calls a custom function

ConvertTableFromTextLines( "c:\data\sample.txt", ":", 3 )

All the work is one here

let  ConvertTableFromText = (filepath as text, splitCharacter as text, linesPerRecord as number, optional encoding as nullable number) as table =>
    let
        TextLines = Lines.FromBinary(
            File.Contents( filepath, null ),
            null,
            null,
            encoding ?? TextEncoding.Utf8
        ),
        TextLineTable = Table.FromColumns(
            { TextLines }, {"Line"}
        ),

        Pairs = Table.SelectRows(
            TextLineTable,
            each Text.Contains( [Line], splitCharacter )
        ),

        SingleRecordAsCols = Table.SplitColumn(
            Pairs,
            "Line",
            Splitter.SplitTextByEachDelimiter(
                { splitCharacter }, QuoteStyle.Csv, false
            ),
            {"Name", "Value"}
        ),

        TotalLines = Table.RowCount( SingleRecordAsCols ) ,
        NumberOfGroups = TotalLines / linesPerRecord,

        ListOfRecords = List.Generate(
            () => 0,
            each _ < NumberOfGroups,
            each _ + 1,
            each (
                Record.FromTable(
                    Table.Range(
                        SingleRecordAsCols,
                        _ * linesPerRecord,
                        linesPerRecord
                    )
                )
            )
        ),
        t = Table.FromList(
            ListOfRecords, Splitter.SplitByNothing(), null, null, ExtraValues.Error
        ),

        columnNameList = Record.FieldNames( ListOfRecords{0} ),

        TableOfRecords = Table.ExpandRecordColumn(
            t, "Column1", columnNameList, columnNameList

        )
    in
        TableOfRecords
in
    ConvertTableFromText

First we load the text file as a list. File.Contents( filepath ) reads all file types. Lines.FromBinary( Contents ) converts it into text. Depending on how the file was saved, you may need to set the encoding

Don’t worry if encoding sounds scary. If text loads with the wrong characters, try the other one. Two good ones to try first are:

TextEncoding.WindowsPower BI defaults to this for windows files
TextEncoding.Utf8Anything on the internet uses UTF8 by default.
Tip: If a file truly is ASCII then UTF8 will work

Next remove any lines that are missing :

Then split each line into two columns:
Transform Column -> Text -> Split -> By Delimiter

I started with these options

I renamed them to Name and Value because they will be converted into records.

To convert multiple lines into a single recordI used List.Generate. You can think of it like a for loop

There are

  • 3 lines per record
  • 9 total lines = the number of lines in the file
  • 3 total groups = 9 total lines / 3 lines per record

We need to loop 3 times. I’ll recreate a loop like this, converted to Power Query

for( $i = 0 ; $i < NumberOfGroups; $i += 1){
    CreateRecord()
}

List.Generate() has 4 arguments. Each one is a function. It uses:

  • initialize sets your starting condition. ( set i to 0 )
  • condition tests if we’ve reached the total number of groups
  • next changes the value of i. We want to add 1
    optional selector function creates the value every loop. This calls CreateRecord() once for every loop
    ListOfRecords = List.Generate(
        () => 0,
        each _ < NumberOfGroups,
        each _ + 1,
        each (
            Record.FromTable( 
                Table.Range(
                    SingleRecordAsCols,
                    _ * linesPerRecord,
                    linesPerRecord
                )
            )
        )
    ),

The next function first grabs rows from the table SingleRecordAsCols. It converts those into a single record

Every loop 3 is added to the offset.

i valueoffsetcount
00 * 3 = 03
11 * 3 = 33
22 * 3 = 63

You end up with a list of records

Choose convert to table

Finally expand records to columns

The final result: