Data Analysis Guided Project 1 - Walmart Customer Analysis

In the first follow-along guided project for DataSimple's AI Enhanced Python Data Analysis Bootcamp, we analyze Walmart customer data. When we analyze the customer data of a shopping center, we want to get a sense of customers' behaviors, find ways to personalize the shopping experience, better understand how customers make their buying decisions, and uncover anything that could give us an advantage over our competitors. We will also consider the effect that taking a random sample has on our data and on the insights drawn from it, so that we can be confident in our analysis before sharing it with our colleagues.

Connect with
Data Science Teacher Brandyn

Review of Python Code 1

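import matplotlib.pyplot as plt  # df is assumed to be an already loaded pandas DataFrame

for feat in df.columns:
    # Skip object (string) columns and plot everything else
    if df[feat].dtype != 'object':
        df[feat].plot(kind='hist', figsize=(4, 3), linewidth=3,
                      edgecolor='black', color='#1B6A80',
                      title=f'{feat} Distribution')
        plt.show()  # render one figure per column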

I never feel comfortable until I see the distributions of my data. This loop gives us our first glance at the data and a sense of the work that needs to be done to complete our data analysis.


  • for feat in df.columns:: It iterates through each column in the DataFrame.

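  • if df[feat].dtype != 'object':: It checks that the column's data type is not object, which is how pandas stores strings and most text-based categorical columns. This keeps text columns out of the histograms, although other non-object types such as booleans or datetimes would still pass the check.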

  • df[feat].plot(...): It plots a histogram for the selected feature with the specified parameters.
    kind='hist': Specifies the type of plot as a histogram.
    figsize=(4, 3): Sets the size of the figure (width, height) in inches.
    linewidth=3: Sets the width of the outline drawn around each bar.
    edgecolor='black': Sets the color of the edges of the bars.
    color='#1B6A80': Sets the fill color of the bars.
    title=f'{feat} Distribution': Sets the title of the plot based on the feature name.

  • plt.show(): Displays the plot for the current feature.


This code essentially creates individual histograms for each numerical feature in the DataFrame and displays them one by one.
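
As a variation on the loop above, here is a minimal sketch that picks the numeric columns explicitly with select_dtypes instead of excluding object columns. The toy DataFrame and its column names, and the helper names numeric_columns and plot_numeric_histograms, are made up for illustration and are not part of the Walmart dataset or the original project code.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy data, invented for illustration only (not the Walmart dataset).
df = pd.DataFrame({
    "Purchase": [8370, 15200, 1422, 1057, 7969],
    "Occupation": [10, 10, 16, 15, 7],
    "Gender": ["F", "F", "M", "M", "M"],
    "Is_Member": [True, False, True, True, False],
})


def numeric_columns(frame):
    """Return the names of columns with a numeric dtype (ints and floats)."""
    return frame.select_dtypes(include="number").columns.tolist()


def plot_numeric_histograms(frame):
    """Draw one histogram per numeric column, styled like the loop above."""
    for feat in numeric_columns(frame):
        frame[feat].plot(kind="hist", figsize=(4, 3), linewidth=3,
                         edgecolor="black", color="#1B6A80",
                         title=f"{feat} Distribution")
        plt.show()


# Text and boolean columns are left out of the selection.
print(numeric_columns(df))
```

Calling plot_numeric_histograms(df) in a notebook would then show one chart for each selected column. Unlike the dtype != 'object' check, this selection also skips the boolean column, which may or may not be what you want for your own data.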
