GPT-4 and Code Interpreter are hotter than Barbie!

Happy Summer! I’m looking forward to seeing Oppenheimer and Barbie on vacation next week. Both are summer blockbusters which are hot, hot, hot! Explosive and blowing up everywhere. So… What could be bigger and a “real-life actual” game changer?
Generative Pre-trained Transformer (GPT-4) and Large Language Models (LLMs).

It’s hard to describe how jaw-dropping OpenAI’s ChatGPT Plus with GPT-4 is, and how it can change your life for only $20 per month. The ability to load a dataset, run an analysis, plot the results, and get the Python code along with a narrative describing the rationale behind advanced statistics is unbelievable. It’s clean, fast, and, overall, technically accurate. Note: I’ve executed the Python code generated by GPT-4 in my own Jupyter notebook and in Visual Studio Code to check the results.
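To give a feel for the kind of code that comes back, here is a minimal sketch of the load-and-analyze workflow. It is my own illustration, not actual GPT-4 output: the sample data, the column names (`ad_spend`, `sales`) and the helper functions are all made up, and I’ve left out the plotting step so the example runs with only the Python standard library.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for an uploaded CSV file.
SAMPLE_CSV = """month,ad_spend,sales
1,10,25
2,20,45
3,30,65
4,40,85
"""


def load_rows(text):
    """Parse CSV text into a list of dicts with float values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: float(value) for key, value in row.items()} for row in reader]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rows = load_rows(SAMPLE_CSV)
spend = [row["ad_spend"] for row in rows]
sales = [row["sales"] for row in rows]
slope, intercept = fit_line(spend, sales)

print(f"mean sales: {statistics.mean(sales):.1f}")
print(f"sales ~ {slope:.2f} * ad_spend + {intercept:.2f}")
```

Running a small, known dataset like this through generated code is also a quick way to check it: you can work out the expected fit by hand and compare.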

I’ll post additional thoughts on my node.js Azure sandbox but don’t wait for me – go get a subscription and try it out for yourself!

Data Warehouse, Data Lakehouse and Data Mesh

Last year, I read a very interesting blog post by Darwin Schweitzer, a Microsoft technologist, who discusses how to consider emerging technologies in the context of building sustainable enterprises. The blog post relates the learning patterns organizations adopt to newer data technologies replacing existing capabilities. The approach covers the strategic, organizational, architectural and technological challenges and changes that come with scaling enterprise analytics.

Three Horizon Model/Framework – strategic
Data Mesh Sociotechnical Paradigm – organizational
Data Lakehouse Architecture – architectural
Azure Cloud Scale Analytics Platform – technological

Read more by following this link:

https://techcommunity.microsoft.com/t5/data-architecture-blog/bring-vision-to-life-with-three-horizons-data-mesh-data/ba-p/3390414


https://geoparquet.org/ is worth checking out!

Recently, I watched a webinar hosted by TDWI, Databricks and Carto. The topic was Unlocking the Power of Spatial Analysis and Data Lakehouses. A copy of the webinar and the shared slide deck is available here. What I liked about the session was the use of Databricks and a Data Lake to provide Spatial Data. There was also a brief discussion on the role of the Open Geospatial Consortium. This group is working on the specifications for creating a GeoParquet file. For anyone with an interest in GIS, Mapping, Data and Analytics, this is worth checking out!

https://github.com/opengeospatial/geoparquet

ESRI Tapestry Segmentation

For the last 20 years, ESRI has been capturing geographic and demographic data. In the ’90s, Acxiom (and others) came up with Lifestyle Segmentation. ESRI started in 1969 and today maps “everything” down to the household level.

I’m surprised by the accuracy of Tapestry Segmentation. The top three segments in my zip code are: “Top Tier”, “Comfortable Empty Nesters” and “In Style”. Pretty accurate – with one data point. I’m an Empty Nester who would like to be “In Style” or “Top Tier” but feel very lucky and fortunate to be “comfortable”.

The 45243 Zip Code

Here is a link to a PDF with more on Tapestry Segmentation and my personal segment.
Just in case you were interested, here are the details on “Top Tier” and “In Style”.

Nathan Yau’s Reading List

I follow and really like Nathan Yau. His site FlowingData.com is a great resource for Data Visualization inspiration. Below is his reading list from during the crisis:

Making Charts

Books specifically about making and using charts…

Statistics

Making sense of numbers…

Development

Some code…

  • R Packages by Hadley Wickham — I know the basics, but I should know more.
  • The Book of R by Tilman M. Davies — A big, fat reference.
  • Some visualization with Python book. I’ve seen some books, but is there a well-regarded reference?

Design

Outside visualization, but applicable…

Inspiration

To think about various visual forms…