I had a chance to take a look at Tableau Public today. I’ve used it in the past, but to be honest Tableau is my third favorite visualization tool (behind Qlik and Power BI). I was impressed to see they used an example mimicking the famous TED talk by Hans Rosling. If you haven’t seen it and you’re interested in data and analytics – it’s a must-watch! 🙂
Click here for a Power BI App about the Avengers movies!
Note: If you are prompted for credentials, send me an email!
I follow and really like Nathan Yau. His site FlowingData.com is a great resource for data visualization inspiration. Below is the reading list he put together during the crisis:
Books specifically about making and using charts…
- Info We Trust by RJ Andrews — Unique because Andrews hand-drew all of the examples himself.
- Data Visualisation by Andy Kirk — It’s next up with the Datavis Book Club.
- Designer’s Guide to Creating Charts and Diagrams by Nigel Holmes — I bought a used copy a while back for a couple of dollars. I’ve always admired Holmes’ style.
- Wordless Diagrams by Nigel Holmes — Got this one too, pretty much for the price of shipping.
- Elevate the Debate edited by Jonathan A. Schwabish — A practical guide aimed at communicating technical research to a wider audience.
Making sense of numbers…
- Factfulness by Hans Rosling — I’ve heard many good things. Probably first up in my queue.
- Exploratory Data Analysis by John Tukey — It’s an outdated textbook but it’s historically rich. I’ve never read it cover-to-cover.
- How Charts Lie by Alberto Cairo — So important these days.
- Understanding data and statistics in the medical literature by Jeffrey Leek, Lucy D’Agostino McGowan, and Elizabeth Matsui
- R Packages by Hadley Wickham — I know the basics, but I should know more.
- The Book of R by Tilman M. Davies — A big, fat reference.
- Some visualization with Python book. I’ve seen some books, but is there a well-regarded reference?
Outside visualization, but applicable…
- The Shape of Design by Frank Chimero — Got this years and years ago. I will assume it has aged like a fine wine.
- The Design of Everyday Things by Don Norman — Charts are everyday things.
- Emotional Design: Why We Love (or Hate) Everyday Things by Don Norman
- Understanding Comics by Scott McCloud — Telling stories visually. Sounds familiar.
To think about various visual forms…
- History of Information Graphics by Sandra Rendgen and Julius Wiedemann — This is a giant book with giant pictures.
- All Over the Map by Betsy Mason and Greg Miller — The stories behind the visuals always make the picture more interesting.
- How To: Absurd Scientific Advice for Common Real-World Problems by Randall Munroe — I admire Munroe’s ability to explain things with stick figures and clear diagrams.
- Math With Bad Drawings by Ben Orlin — Again, interested in the process more than I am in the material.
I took a quick look at the COVID-19 data using Power BI and Qlik Sense. Both have their advantages, but they use the same dataset: a shared table in Snowflake (CT_US_COVID_TESTS).
This is a difficult time for many of us. My thoughts are first with all of the people suffering from the disease and its impact. They are also with the heroic first responders, the men and women who are risking their own lives to save others.
There are a number of interesting sites I’ve been following to get more info. The site below does a great job of explaining the growth of the virus and how to tell if we are flattening the curve.
The original Johns Hopkins site uses a map delivered by ESRI and ArcGIS to track the progression of the virus:
Yesterday I attended a free workshop put on by Snowflake. The session entitled “Zero to Snowflake in 90 Minutes” provided information on Snowflake’s Architecture, Performance and Scalability as well as a “hands-on” demo. Snowflake touts itself as “The Data Warehouse Built for the Cloud” and is gaining enterprise customers at a dizzying pace.
The “demo” used data from Citi Bike – New York City’s bike share system. Citi Bike is the nation’s largest bike sharing service. The data can be downloaded from: https://www.citibikenyc.com/system-data
The workshop provides an introduction to how to set up and use Snowflake. The outline is below, and the lab takes about 90 minutes:
Module 1: Prepare Your Lab Environment
Module 2: The Snowflake User Interface & Lab “Story”
Module 3: Preparing to Load Data
Module 4: Loading Data
Module 5: Analytical Queries, Results Cache, Cloning
Module 6: Working With Semi-Structured Data, Views, JOIN
Module 7: Using Time Travel
Module 8: Roles Based Access Controls and Account Admin
Module 9: Data Sharing
I found the workshop very interesting for two reasons. First, it covered all the basics of using a cloud-based database: users loaded data from an S3 bucket, parsed both CSV and JSON files, queried the database, and managed schemas and security. Second, I enjoyed the session because Qlik’s Elif Tutuk used this dataset for a Qlik Sense demo app.
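To give a feel for the "parse both formats" part of the lab, here is a minimal local sketch using only the Python standard library. It is not the Snowflake loading process itself, and the sample rows, column names, and JSON fields are made up for illustration; they are not the real Citi Bike or workshop files.

```python
import csv
import io
import json

# Hypothetical sample data: these columns and values are invented for
# illustration and do not come from the actual workshop files.
TRIPS_CSV = """trip_id,start_station,duration_seconds
1,Station A,420
2,Station B,915
"""

OBSERVATIONS_JSON = """[
  {"city": "New York", "weather": "Clear", "temp_f": 68.0},
  {"city": "New York", "weather": "Rain", "temp_f": 55.5}
]"""


def parse_trips(csv_text):
    """Parse CSV text into a list of dicts, converting numeric fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["trip_id"] = int(row["trip_id"])
        row["duration_seconds"] = int(row["duration_seconds"])
        rows.append(row)
    return rows


def parse_observations(json_text):
    """Parse a JSON array of records into a list of dicts."""
    return json.loads(json_text)


trips = parse_trips(TRIPS_CSV)
observations = parse_observations(OBSERVATIONS_JSON)
print(len(trips), "trips;", len(observations), "observations")
```

The CSV rows arrive as strings and need explicit type conversion, while the JSON records keep their types and nesting, which is roughly why the two formats get handled differently when loading into a database.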
I found a copy of the old Qlik Demo app and set it up on a Qlik Sense instance.
I created an ODBC connection (using a DSN) and was able to update the data from Snowflake. The combination of Qlik Sense and Snowflake is compelling. I liked the Snowflake demo, especially when I could match it up with the visualizations from Qlik Sense.
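For anyone who wants to test the same kind of DSN-based connection outside Qlik Sense, here is a rough Python sketch. It assumes the third-party pyodbc package and a Snowflake ODBC driver are installed and that a DSN has already been configured; the DSN name, credentials, and the idea of counting rows in a table are all hypothetical placeholders, not the setup I actually used.

```python
def build_dsn_connection_string(dsn, user, password):
    """Build an ODBC connection string that refers to a preconfigured DSN.

    Values containing semicolons or braces would need extra escaping,
    which this sketch does not handle.
    """
    parts = {"DSN": dsn, "UID": user, "PWD": password}
    return ";".join(f"{key}={value}" for key, value in parts.items())


def fetch_row_count(connection_string, table):
    """Connect through ODBC and count the rows in a table.

    Assumes pyodbc and a working ODBC driver are available. The import is
    deferred so the string helper above works without them. The table name
    is interpolated directly, so it must come from a trusted source.
    """
    import pyodbc

    connection = pyodbc.connect(connection_string)
    try:
        cursor = connection.cursor()
        cursor.execute(f"SELECT COUNT(*) FROM {table}")
        return cursor.fetchone()[0]
    finally:
        connection.close()


# Hypothetical DSN name and credentials, for illustration only.
conn_str = build_dsn_connection_string("snowflake_demo", "demo_user", "demo_password")
print(conn_str)
```

The point of a DSN is that the driver, account, and warehouse details live in the ODBC configuration, so the client only needs the DSN name and credentials.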
In late 2018-2019, I worked on a Credit Scoring model. As part of the work I wrote a “whitepaper” outlining the process, methodology and results.
A copy of the whitepaper is available for download: https://www.ericfrayer.com/wp-content/uploads/2019/10/Credit_Scoring_Whitepaper_v1.pdf
As part of the project I used R and SPSS for model construction and verification. The output from the model is available here:
In 2012, I picked up a copy of Teo Lachev’s “Applied Analysis Services.” The book featured how Microsoft was pulling together Excel, Power View, Power Pivot, Tabular Modeling and the new DAX language (Data Analysis eXpressions). The traditional OLAP SSAS MDX cube wasn’t going away, but the new hardware options and increased need for self-service meant a new technology was required. DAX uses standard Excel formula syntax. This gave business users a way to extend Excel logic, formulas and calculations. The power of Excel with the promise of self-service BI is pretty compelling.
During my time at Qlik (2012-2017), Microsoft continued to build and expand its products. With the Tabular model, Microsoft adopted a “columnstore indexing” strategy using VertiPaq. This allowed for much more data to be available on disk and in-memory.
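To illustrate why storing data by column lets more of it fit in memory, here is a toy Python sketch of two general columnar compression ideas: dictionary encoding and run-length encoding. This is not VertiPaq’s actual implementation, which is far more involved, and the sample column is hypothetical.

```python
def dictionary_encode(column):
    """Replace each value with a small integer ID plus a lookup list."""
    dictionary = []  # position in this list is the ID of the value
    ids = {}         # value -> ID, for fast lookup while encoding
    encoded = []
    for value in column:
        if value not in ids:
            ids[value] = len(dictionary)
            dictionary.append(value)
        encoded.append(ids[value])
    return dictionary, encoded


def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


# Hypothetical column, sorted so that repeats sit next to each other.
region = ["East", "East", "East", "North", "North", "West"]
dictionary, encoded = dictionary_encode(region)
runs = run_length_encode(encoded)
print(dictionary)  # ['East', 'North', 'West']
print(encoded)     # [0, 0, 0, 1, 1, 2]
print(runs)        # [[0, 3], [1, 2], [2, 1]]
```

Six strings shrink to three dictionary entries and three short runs. A column with few distinct values compresses well this way, which is much harder to do when values from different columns are interleaved row by row.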
For more info visit: https://docs.microsoft.com/en-us/analysis-services/
Recently I read a very interesting article on how the “digital world” allows for much more experimentation than was previously possible. Online marketing and research have changed the discipline. The authors of the article explore some of the “subtleties” of designing experiments and provide guidelines.
A copy of the article is available below:
This is just a quick post to share that I’m using this site mainly as a “technical sandbox”: someplace to try out different functionality and post working examples. I’m using Google Analytics in a browser and in an app on my phone to see if I’m getting any traffic. Most of my “users” are friends, colleagues and potential employers to whom I’ve given the URL.
Anyway, here is a page from Google Analytics for my site. I guess it shouldn’t be surprising how much Google provides for developers and internet users.