Integrating Workspace ONE reporting with PowerBI

Last week one of my customers asked if Workspace ONE data could be used in PowerBI. Since Workspace ONE Intelligence Reports are included in every WS1 edition and are accessible via API, they are the easiest way to expose data to external systems. But how exactly? Even though I had never used PowerBI before, after a few days I managed to connect the two systems. In this post I will demonstrate the solution and highlight some quirks of both Intelligence and PowerBI. Strap in – this is going to be a bumpy road!

Everything below is done using the "free tier" of Workspace ONE Intelligence - no add-on licenses required!

Plan of action:

  1. Brief overview of Workspace ONE Intelligence (onwards: WS1I) Reports API, resources, some nuances.
  2. How all this applies to our use case
  3. Setting up WS1 Intelligence for API integration
  4. Setting up PowerBI (PBI) to interact with WS1 Intelligence.
    1. Setting up the workspace
    2. Writing PowerQuery M (PQM) support functions to authenticate and pull data
    3. Writing queries that actually pull data
    4. Publishing all that to the web service to enable a scheduled cloud-to-cloud data pull
    5. Some notes and impressions of PQM when it comes to working with arbitrary web APIs, caveats, and extremely useful resources I wish I had found a day earlier 🙂
  5. Summary and wrap up.

Brief overview of WS1 Intelligence Reports API

In a nutshell, anything you can see in WS1I reports you can access through the WS1I Reports API. Additionally, you can use the API to create, modify, run, share reports, etc. Thus, theoretically, the full report lifecycle is possible via the API – if your system suddenly decides that it needs another column of data, there is no need to log into the WS1I console to manually edit anything. Cool, isn't it?!

The full workflow would thus look like this:

  • Connect/authenticate (via OAuth2) – this yields you a JWT bearer token that you need to specify in every subsequent request.
  • Create a report with your columns or list/search through existing ones – the end result is that you have a Report ID. I will later show you how to look up the report ID for existing reports in the WS1I web console.
  • With this ID you can
    • Modify the report
    • Load the report preview (returns the first 1000 rows as JSON)
    • Run the report once or schedule it for periodic execution
    • List available downloads (actually, report generation jobs) for the report – this is the one we need to pull the actual data!
    • Do other things like sharing / viewing recipients etc. that do not matter as much. I will link the full API guide at the end of the post, where you can see all possible options.
  • Normally, to pull data we would list the available downloads and choose the latest one that has status COMPLETED – this will yield a Download ID.
  • With this Download ID we can finally pull the actual report data (returns CSV).
  • Process the CSV in any way you like.

Please note that the report itself does not contain any consumable data (aside from the preview, which is limited to 1000 rows). The actual data is contained in the Download. Thus the report must have run at least once (either via a manual run / API call, or via a schedule) and a download with a COMPLETED status must exist. The download list is sorted by time by default (newest on top), so choosing the first available completed download yields us the newest dataset.
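If you want to prototype the "newest COMPLETED download" selection outside of PowerQuery first, it can be sketched in Python. The payload below is illustrative only: the field names (data.results[].id/status) match what we will use later in the M code, but the values are made up.

```python
# Sketch: picking the newest COMPLETED download from a downloads/search response.
# The sample payload is hand-made; real responses carry more fields per result.
sample_response = {
    "data": {
        "results": [  # newest first, as the API sorts by time descending
            {"id": "dl-003", "status": "RUNNING",   "modified_at": "2021-05-03T10:00:00Z"},
            {"id": "dl-002", "status": "COMPLETED", "modified_at": "2021-05-02T10:00:00Z"},
            {"id": "dl-001", "status": "COMPLETED", "modified_at": "2021-05-01T10:00:00Z"},
        ]
    }
}

def newest_completed_download(response):
    """Return the ID of the first (= newest) download with status COMPLETED."""
    for result in response["data"]["results"]:
        if result["status"] == "COMPLETED":
            return result["id"]
    raise LookupError("No COMPLETED download yet - has the report ever run?")

print(newest_completed_download(sample_response))  # dl-002
```

Note that simply taking the first result would pick the still-running job dl-003, which is exactly the "dangerous" shortcut we avoid later in the M code.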

Two important notes on Preview vs Download! I've consistently seen people fall into these traps when discussing the Reports API.

1. Report Preview returns JSON, whereas report Download returns CSV. I understand why CSV was chosen for full data reporting (1. a LOT less overhead compared to JSON, 2. easier to import into Excel), but I honestly do not understand why preview does not return CSV as well. 

2. Preview is limited to 1000 rows - do not use it to pull actual data!

Another important note is that in order to access the Reports API you need to create a Service Account in WS1 Intelligence and the report you want to access must be shared with the service account. I will show you later how to do it.

This concludes the brief overview. For the full picture, here are two useful resources:

Now, let's adapt it to our use case.

Implementing our use case

So, the API is great and powerful, but we don't need all of it for our use case. Here is the flow that I have implemented:

  • In WS1 Intelligence
    • Manually create a report in WS1 Intelligence
    • Manually note the report ID (copy-paste)
    • Manually share the report with the PowerBI service account (that I have created in advance)
    • Manually schedule the report to run periodically
  • In PowerBI
    • Authenticate with service account credentials
    • Given the report ID, pull the list of downloads and choose the topmost one with COMPLETED status
    • Import CSV
    • Do whatever we want with it (not covered)
    • Upload everything to PowerBI cloud and set up auto-refresh so that PowerBI pulls new downloads regularly (ensure PowerBI schedule plays well with WS1I schedule!)

Looks a lot easier, doesn't it? Let's move on to setting up WS1I.

Setting up WS1 Intelligence for API integration

First, we need to create a service account. This is done in the Intelligence console under Settings -> Service Accounts. The process is straightforward and results in account creation and a JSON file with all the details. Keep in mind that you will not be able to look up the client secret later on! This is why Intelligence gives you a JSON file to download with all the data. 🙂

This is how a created service account looks. Mine is called "reports".
This is how the downloaded JSON looks

Now, let's get the report sorted. Using the picture below as a reference, here is what we need to do:

  1. Note down the Report ID from the URL (see the highlighted part). This will save us a good chunk of PowerQuery coding.
  2. Using the Share button ensure that the report is shared to the service account (choose view only permissions), otherwise you will not be able to see it via the API. This is actually a very powerful feature to reduce the attack surface!
  3. Using the Run button run the report at least once to ensure we have a dataset to download later on.
  4. Using the Schedules tab create a schedule so that new datasets would be generated automatically (and we can do even less API coding in PowerQuery).
All you need to do with the report is here

We're all set on the Intelligence side and we are ready to write some code in PowerBI!

Setting up PowerBI

Disclaimer: Please keep in mind that my total exposure to PowerBI is < 3 days. The code is not production grade, and probably can be made shorter, clearer, less error-prone, etc. But it works and implements some important best practices. PowerQuery M is a functional language, and I am not used to functional programming, so things like code branching may look ugly and can probably be made better too. You are welcome to write improvement suggestions in the comments!

Here are the general principles of the below code:

  1. Solution should work in the PBI web service w/o having to talk to PBI Desktop application (and the data auto-refresh must work). Code must be written in a certain way for this to work.
  2. I want to minimize the amount of copy-pasted code by using functions, wrappers for error handling etc.
  3. I want to ensure solution portability, so we don't store parameters such as usernames, hostnames or shared secrets in the code (you won't believe how many examples on the internet have all this hardcoded!).

OK, enough talking – let's get to doing.

First, we need to get PowerBI Desktop, start a new project and go to the Query Editor by selecting Get Data -> Blank Query. Then, in the Query Editor we want to set up some Groups to keep things organized and Parameters to make our solution portable. All of these can be created by right-clicking the empty space under "Other Queries". This is how my setup looks

The parameters are (all required, all text):

  • INTELTENANT – this is the WS1 Intelligence tenant host (e.g. eu1.data.vmwservices.com)
  • OAUTH2_CLIENTID – the client ID from the service account JSON file
  • OAUTH2_CLIENTSECRET – the client secret from the service account JSON file

Having the parameters set up and populated, we can begin writing the code.

Function Get_OAuth2AccessToken

Our first task is to authenticate to WS1I Reports API and retrieve the bearer token that must be passed along all other API calls. This is the only function that uses auth.[tenant] endpoint instead of api.[tenant].

This is the code for it

// Gets the JWT token
// TODO: do proper expiration check on the existing token and only refresh when needed
// I can do the expiration check, but have no idea how to store the token in between calls.

() =>
let
    authKey = "Basic " & Binary.ToText(Text.ToBinary(#"OAUTH2_CLIENTID" & ":" & #"OAUTH2_CLIENTSECRET"), BinaryEncoding.Base64),
    url = "https://auth." & #"INTELTENANT" & "/oauth/token?grant_type=client_credentials",

    GetJson = Web.Contents(url,
        [
            Headers = [#"Authorization" = authKey,
                       #"Content-Type" = "application/x-www-form-urlencoded;charset=UTF-8"],
            IsRetry = true, // dirty hack to prevent PQ from caching the expired token
            Content = Text.ToBinary("grant_type=client_credentials")
        ]
    ),

    //TODO: Error processing
    FormatAsJson = Json.Document(GetJson),

    // Gets the token from the JSON response
    OAuth2_AccessToken = FormatAsJson[access_token]
in
    OAuth2_AccessToken

Most of this code is quite straightforward – we authenticate to the API authentication endpoint (which you will get in the JSON file with service account credentials) using basic auth. We receive a JSON, from which we extract our JWT bearer token.

Important notes:

  • Currently, there is no concept of a refresh token in the WS1 Reports API, so you need to get a new access token every time the old one expires (after one hour).
  • PowerQuery has a tendency of caching queries, which means your request to get a new token may simply not be executed – you will get the old cached (potentially expired) one instead! I spent several hours troubleshooting stupid error prompts before I figured out what was going on! The IsRetry parameter ensures we always get a new token. This is, honestly, a bad way of doing things, and I have written a [slightly] better piece of code that checks the current token for validity and only requests a new one when the old token expires. But it is a huge chunk of ugly code that I don't want to show.
  • The same applies to error processing. There should be some, but we assume the best-case scenario here – otherwise you will easily get lost in piles of code.

The end result is that this function returns a plain text string that is your JWT token, to be passed in the Authorization header later on.
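For reference, here is what the authKey line above computes, sketched in Python. The credentials are placeholders, not real ones – in practice they come from the service account JSON.

```python
import base64

# Sketch: the Basic auth header for the OAuth2 token request, mirroring the
# authKey line in the M function above. Placeholder credentials only!
client_id = "reports@example"
client_secret = "s3cr3t"

auth_key = "Basic " + base64.b64encode(
    (client_id + ":" + client_secret).encode("utf-8")
).decode("ascii")

print(auth_key)  # Basic cmVwb3J0c0BleGFtcGxlOnMzY3IzdA==
```

This is plain RFC 7617 Basic auth: "client_id:client_secret", Base64-encoded, with the "Basic " prefix.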

Function WS1I_ReportAPI_Call

This is a wrapper for WS1I Reports API calls that saves the hassle of sending the same parameters over and over. It makes the "business-level" code simpler.

It also does some error processing (in a weird and, honestly speaking, half-baked way, but it will give you some ideas). The nuance is that some WS1I APIs require GET and some require POST – if you confuse the two you will most likely get an HTTP 405 "Method not allowed" error. Most APIs are POST, so I've created an optional parameter to specify a GET request, as well as pagination etc. (which in most cases we don't need).

The function returns a record, with fields Success (true|false), Content (response content or error details) and Metadata.

In the future this function can be enhanced to call specific API methods, to work with named vs positional parameters, to put metadata into actual metadata instead of a record field, etc., but again, the point here is not to learn PowerQuery 🙂 This code is 3 times smaller w/o error checking!

Here is the code

(REQ_URL as text, optional REQ_TYPE as text, optional REQ_body as text, optional IS_RETRY as logical) =>
let

ErrorTable = [
    #"400" = "Bad Request. Query content replaced with error details", // a 400 error returns details, everything else does not
    #"401" = "Authentication failed. Likely your access token needs to be renewed.",
    #"403" = "Authorization failed. You attempted to access a resource or perform an operation that you are not permitted to do.",
    #"404" = "The resource you attempted to access does not exist. Most likely an ID in the URL is not correct.",
    #"405" = "Method not allowed. Most likely POST vs GET.",
    #"429" = "Rate limit exceeded.",
    xxx = "Some other HTTP error."
],

OAuth2AccessToken = Get_OAuth2AccessToken(),

ResponseContent = Web.Contents("https://api." & INTELTENANT & "/v1/reports/",
  [
    RelativePath = REQ_URL,
    // assume POST by default
    Content = if (REQ_TYPE = "GET") then null else Text.ToBinary(if REQ_body = null then "{}" else REQ_body),

    Headers = [
      Authorization="Bearer " & OAuth2AccessToken,
      Accept="*/*", 
      #"Content-Type"="application/json"
    ],
    IsRetry = if IS_RETRY = null then false else true,
    ManualStatusHandling = {400, 401, 403, 404, 405, 429}
  ]
),
    
  ResponseMetadata = Value.Metadata (ResponseContent),
  ResponseCode = ResponseMetadata[Response.Status],
  // if Response.Status is not 2xx, return an error with details (important: some APIs return 201 etc. - not just 200),
  // else return the content (do not parse it, since it can be JSON or CSV - we don't know)
  IsOK = Text.Start(Number.ToText(ResponseCode), 1) = "2",
  ReturnStatus = IsOK,
  

  // If response is 400 Content should be the JSON array with error description

  ReturnContent = if (IsOK = true) 
    then ResponseContent
    else if ResponseCode <> 400
      then [
        code = ResponseCode,
        message = Record.FieldOrDefault(ErrorTable,Number.ToText(ResponseCode),ErrorTable[xxx]) 
      ]
      else Json.Document(ResponseContent)[errors]{0},

  Result = [
    Success  = ReturnStatus,
    Content  = ReturnContent,
    Metadata = ResponseMetadata // this is just in case for advanced error processing
    ]
in
    Result

Some notes

  • For error checking we need to check the first digit of the response code: 2xx is good (not just 200, since some APIs, like running a report, return a 201, for example), 4xx or 5xx is bad, and for 400 we can do some additional error processing. Purists may argue that dividing by 100 is better than text conversion – your call 🙂
  • I return success status explicitly, since different business logic may decide to handle errors differently (retry vs do something else such as notify the user, etc).
  • Note the usage of the RelativePath parameter vs fully constructing the URL. This is important for auto-refresh to work in the PowerBI web service. Why? – ask MS, but if you don't do this, it will not work.
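The 2xx check from the first note can be sketched in Python both ways – the text approach used in the M wrapper and the "purist" arithmetic one. They classify every status code family identically:

```python
# Both variants of the wrapper's success check, sketched in Python.

def is_ok_text(code):
    # First character of the status code as text, as in the M wrapper
    return str(code)[0] == "2"

def is_ok_math(code):
    # Integer division by 100: "is this in the 2xx family?" without text conversion
    return code // 100 == 2

# The two approaches agree on every status code
for code in (200, 201, 204, 301, 400, 404, 405, 429, 500):
    assert is_ok_text(code) == is_ok_math(code)

print(is_ok_math(201))  # True
```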

Now we can write code to download the report

Function Report_download

The name is self-explanatory. Thanks to our wrapper, the function declaration simply requires one parameter – the report ID (which we looked up from the Intelligence console).

Here is the code. Basically, we make an API call to list the downloads and, if successful, extract the ID of the newest COMPLETED download and make another call to get the CSV. However, due to the functional nature of the language we must use nested if…else blocks instead of shorter "if [error] abort" structures. If you know a better way – please let me know! Again, this code looks a lot simpler w/o error processing.

(REPORT_ID as text) =>
let

// API call to get the list of downloads
Report_DownloadList_URL = REPORT_ID & "/downloads/search",
Report_DownloadList_Response = WS1I_ReportAPI_Call (Report_DownloadList_URL),

Report_DownloadResult = if Report_DownloadList_Response[Success] = false
// Error branch
    // Content is already a record here (the wrapper parsed it), no Json.Document needed
    then Error.Record ("API call failed", REPORT_ID & "/downloads/search", Report_DownloadList_Response[Content])
    
    else
// Success branch
        let
           // extract the data 
           Report_DownloadList_Data = Json.Document(Report_DownloadList_Response[Content])[data],
            
            // Find the top level result with status COMPLETED and get the ID
            // can be made a lot simpler if not filtering for COMPLETED, but also dangerous 🙂
            Report_DownloadList_Results = Report_DownloadList_Data[results],
            Report_DownloadList_Results_Table = Table.FromList(Report_DownloadList_Results, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
            #"Expanded Column1" = Table.ExpandRecordColumn(Report_DownloadList_Results_Table, "Column1", {"id", "status", "modified_at"}, {"id", "status", "modified_at"}),
            Report_DownloadList_Completed = Table.SelectRows(#"Expanded Column1", each [status] = "COMPLETED"),
            Report_DownloadID = Report_DownloadList_Completed{0}[id],

            // API call to download CSV
            Report_DownloadCSV_URL = "/tracking/" & Report_DownloadID & "/download",
            Report_DownloadCSV_Response = WS1I_ReportAPI_Call (Report_DownloadCSV_URL,"GET"),

            Report_DownloadCSV_Result = if Report_DownloadCSV_Response[Success] = false
// Error branch
                // Content is already a record here (the wrapper parsed it), no Json.Document needed
                then Error.Record ("API call failed", "/tracking/" & Report_DownloadID & "/download" & ":: Report ID:" & REPORT_ID, Report_DownloadCSV_Response[Content])
                else let
// Success branch
                    Report_DownloadCSV_data = Csv.Document(Report_DownloadCSV_Response[Content]),
                    Final_Result = Table.PromoteHeaders(Report_DownloadCSV_data, [PromoteAllScalars=true])
                in Final_Result
            
        in     
           Report_DownloadCSV_Result

in
    Report_DownloadResult

The result is a table converted from CSV with CSV first line promoted to table headers. Having this function we can now quickly author queries to generate the actual datasets.

Generating actual reports in PowerBI

With all the wrappers and functions we have, creating queries to generate the actual reports becomes very simple. Here's the code in its most basic form

let
    Source = Report_download("375c49a4-8af5-4e5e-96b4-d6b015828a33")
in
    Source

And here is the result

With all the support code in place, pulling data requires only knowing the report ID

Now you can easily create as many queries as you want to pull as many reports as you want. You can also further enhance the code to do any data massaging required for an individual report (though it might be better to do it later, in the DAX part of PowerBI), perform custom error processing at the business logic level, etc. Basically, that's it! …except for a few things!

Getting connectivity and privacy settings right

Most likely, once you try to pull your first report or issue an API call, you will get an "Enter credentials" or "Specify how to connect" dialog. You may also get various confusing errors related to privacy settings, not being able to mix data from different sources, etc. All this boils down to your data source auth and privacy settings. This is another convoluted topic. Getting these settings right is important for publishing to the web service, having auto-refresh work, portability, etc.

Long story short, this is how it should be set up:

  • Auth type: anonymous
  • Privacy level: organizational

  • Note that thanks to using RelativePath in our WS1I_ReportAPI_Call wrapper, we only have a single /reports data source to deal with, instead of having to configure every single API endpoint (/download, /search, etc.) individually!

Validating in PowerBI Desktop and publishing to the web service

Now that our solution works, we can click Close & Apply to actually issue the calls and load our data into the PowerBI data model. In my case I have two reports set up, so I see two datasets in the model view. I also see the Publish button

After clicking Publish you will be prompted for credentials to connect to the cloud PowerBI instance and to select a workspace, and then we can move to the web service (browser console) for the final touches.

Now, let's log into PowerBI in the browser and open our workspace. We should see the dataset and the report there. We need to validate the dataset settings and configure auto-refresh.

The settings we care about are Data source credentials, Parameters and Scheduled Refresh.

For Data source credentials you'd want to double-check the auth and privacy settings (they should be auto-imported from PBI Desktop), and enable "Skip test connection" – PBI issues authenticated calls, and the WS1 api.[tenant] endpoint reacts poorly to them.

The Parameters should also auto-import from PBI Desktop – just check that they were imported correctly, or maybe change them from test to prod credentials, etc. – this is why using Parameters is important and beneficial.

For Scheduled refresh – just turn it on and ensure that it plays correctly with the report schedule in WS1 Intelligence.

The final bit of testing is clicking Refresh icon on the dataset to ensure everything works. If you get errors you will see a warning triangle sign. Hovering / clicking on it will provide the details. Then compare your setup against the one in this article and call your nearest PowerBI expert for I am not the one!

Your final test is manually refreshing the dataset in PowerBI web service

Otherwise you are done!

Summary and Useful resources

We have looked at:

  • Workspace ONE Intelligence API structure (quickly)
  • Setting up WS1 Intelligence for API access by other systems
  • Coding and configuring PowerBI to access WS1I report data from PowerBI Desktop
  • Publishing the whole thing to PowerBI Web with scheduled data refresh

Hope this will help you with PowerBI or other reporting systems! If you have suggestions for making the PowerQuery code better – they are welcome. As I said, I only had 3 days! 🙂

Useful resources:

That was a long one – congratulate yourself for surviving!

Bonus 01: Exporting as Template

One cool thing about parameters in your PBI project is that you can share it with others w/o exposing your secret keys etc. Simply choose File -> Export as Template and your code will be shared w/o parameter values. Then others will be prompted to enter their parameters upon template import. Easy, yet useful!

Bonus 02: JWT Token expiration date.

The code below extracts the expiration time from the JWT token (UTC) and checks whether the token has expired.

(Token as text) =>
let
    SplitEnc = Text.Split(Token, "."),
    // JWT payloads are base64url-encoded without padding: swap the URL-safe
    // characters back and re-pad before decoding with standard Base64
    PayloadB64 = Text.Replace(Text.Replace(SplitEnc{1}, "-", "+"), "_", "/"),
    PayloadPadded = PayloadB64 & Text.Repeat("=", Number.Mod(4 - Number.Mod(Text.Length(PayloadB64), 4), 4)),
    JWT_JSON = Json.Document(Binary.FromText(PayloadPadded, BinaryEncoding.Base64)),
    expDT = #datetime(1970, 1, 1, 0, 0, 0) + #duration(0, 0, 0, JWT_JSON[exp]), // exp is in SECONDS, not milliseconds
    exp_UTC = DateTime.AddZone(expDT, 0), // mark as UTC time

    TimeNow_UTC = DateTimeZone.UtcNow(), // UTC

    TokenExpired = TimeNow_UTC > exp_UTC
in
    TokenExpired
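For comparison, here is the same check sketched in Python, with the base64url quirk handled explicitly. The token below is a hand-crafted dummy (header.payload.signature), not a real WS1I token.

```python
import base64
import json
import time

def jwt_expired(token, now=None):
    """Return True if the JWT's exp claim is in the past."""
    payload_b64 = token.split(".")[1]
    # JWT payloads are base64url-encoded WITHOUT padding: swap the URL-safe
    # characters back and re-pad before handing it to the standard decoder
    payload_b64 = payload_b64.replace("-", "+").replace("_", "/")
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.b64decode(payload_b64))
    exp = payload["exp"]  # seconds since the Unix epoch, not milliseconds
    return (now if now is not None else time.time()) > exp

# Dummy token whose payload says exp = 2000000000 (a date in 2033)
dummy_payload = base64.urlsafe_b64encode(b'{"exp": 2000000000}').rstrip(b"=").decode()
dummy_token = "eyJhbGciOiJIUzI1NiJ9." + dummy_payload + ".signature"

print(jwt_expired(dummy_token, now=1000000000.0))  # False
print(jwt_expired(dummy_token, now=3000000000.0))  # True
```

The base64url normalization matters in M as well: decoding the raw payload segment with plain BinaryEncoding.Base64 can fail on tokens containing '-' or '_' characters.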
