Skip to main content

Cleanliness is next to godliness

My first post shows you how to build a clean report in Excel using a basic design principle in computer science: separation of concerns. The idea here is to separate distinct parts of your workbook into separate tabs including: inputs, mappings, data, pivots, presentation. Here is the finished product of a line chart comparing median prices of 1-family homes in Harlem and the UWS:


Let's get started with the "Inputs" tab. This is where we'll define the inputs that drive the workbook. For this example, let's create Date and Url named ranges. You can define a NamedRange by highlighting the cell and entering the named range in the top-left ComboBox.


Next let's grab the data. For this example, we'll pull home sales data from NYC's OpenData which "is free public data published by New York City agencies and other partners". The nice thing about NYC OpenData is that they provide an OData interface which plugs neatly into Excel. Let's get started by loading the data source via the Data ribbon => Get Data => From Other Sources => From OData Feed.


In the Advanced Editor, build the following Power Query M formula to dynamically pull in the OpenData.


Click "Close & Load" to populate the data tab. I included the additional step of cleaning the "type_of_home" column by creating an adjusted "Type" column with the formula: =SUBSTITUTE([@[type_of_home]],"  "," "). This subtle formula replaces two consecutive spaces with one space.


To continue on that housekeeping thread, I'm going to add a "SuperNeighborhood" column to aggregate similar neighborhoods. Instead of doing this inline, let's create a generic mapping formula that will be used elsewhere. To do this, we'll need to create a "Mappings" tab.


And then we're going to create another rightmost column in the Data tab called "SuperNeighborhood". The formula is complex and I'll explain it in more detail in another post. For now, copy and paste the following formula:


Next step is to create the Pivot tab. Click Insert ribbon => PivotTable and reference the OData query:


Set the pivot table fields; for this example, I filtered for "SuperNeighborhood" Harlem and UWS and aggregated average price by year.


Finally create the report!


Conclusion: this post shows you a clean way of building data-driven reports. The important thing to learn is keeping your tabs clean and separate. This allows you to dynamically change inputs that download the data, refresh the pivots, and then produce the desired report. Building Excel workbooks this way will make your life easier because it will reduce the time needed to maintain and adjust the report while letting you focus on the added value analysis.

Comments

Popular posts from this blog

Mappings to Heaven

Let's dive into the weeds of the mappings formula discussed in my earlier posts. Our goal is to translate fields without any hardcoded string literals or direct references to the fields. By dereferencing formulas, we make them adaptable to change and easy to copy/paste formulas without changing the references. First let's review the following mappings formula: This formula uses the header to "map" another field in its row to a mapped value. The from=>to mappings are defined in the "Mappings" tab. The first inner "Index" function gets the value in the "neighborhood" column. This is in LOC 4 and 7 (used to return the default value). Here is the syntax highlighting from that LOC: "MappingsHeaders" is a named range set to the first row of the "Mappings" tab. The second inner "Index" returns the mapped-from range. This is what the offset in the original formula returned. Here's the "ne...