Data: Simple Tips for Data Analysis

Explore Pandas, Seaborn, Yellowbrick, Plotly, and Shap in detail. Learn how to make beautiful plots and how to extract insights from your data analysis. A data analyst needs to provide insights both to business partners and to a machine learning engineer. The insights each one needs can be very different, and the understanding of the data will be used in different ways. Let's sharpen our Python data analysis and plotting skills with Seaborn, Pandas, Plotly, and Shap.

Python Machine Learning Guided Projects

Explore the many Machine Learning models in Python with Sklearn. Machine Learning is very powerful in the range of tasks it can handle. Let's look at regression and classification problems with Sklearn and use models like LinearRegression, ARDRegression, DecisionTrees, RandomForest, GradientBoosting, and NuSVR.

Univariate Analysis

This starter project is great for those new to Sklearn and machine learning. Learn how to set up an ML workflow. Use pandas and seaborn in Python to perform your data analysis. Then use Sklearn to do the train-test split and make your final test predictions.

Python Simple Instructional ML Random Forest Project

In this Python guided project, you can follow along and build your first simple Random Forest machine learning model. We will use RandomForestClassifier from Sklearn. In an ML workflow, it is a good idea to have a simple baseline model that your more robust model will try to beat. Here, Logistic Regression acts as our baseline model and Random Forest acts as our more complex model.
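The baseline-versus-complex-model idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the synthetic data from `make_classification`, the split size, and the hyperparameters are all assumptions chosen only to keep the example self-contained.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; the project uses its own dataset).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# Hold out 25% of the rows for the final comparison.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Simple baseline model.
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# More complex model that should try to beat the baseline.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Accuracy on the held-out rows for both models.
baseline_acc = baseline.score(X_test, y_test)
forest_acc = forest.score(X_test, y_test)
print(f"Logistic Regression accuracy: {baseline_acc:.3f}")
print(f"Random Forest accuracy:       {forest_acc:.3f}")
```

If the more complex model does not clearly beat the baseline on held-out data, the simpler model may be the better choice.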

Simple Clustering Iris Flowers

Level 1, 23 minutes

Follow along in this free-to-use Simple Clustering Project. In this project we use the classic Iris data set for KMeans clustering with Sklearn. To decide how many centroids to use in k-means, we will use the Yellowbrick elbow method plotting tool to find a good number of clusters. There is no certain way to say which number of clusters is best, but the elbow method is a valuable technique for getting a sense of a good number of centroids for k-means.
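The numbers behind an elbow plot can be sketched with Sklearn alone. This is an assumption-laden illustration rather than the project's code: it prints the inertia (within-cluster sum of squares) for each candidate k instead of drawing the curve, and the range of k values is arbitrary. Yellowbrick's elbow visualizer plots a comparable curve.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# The classic Iris measurements (150 flowers, 4 features).
X = load_iris().data

# Fit k-means for several values of k and record the inertia of each fit.
inertias = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

# The "elbow" is where adding another cluster stops reducing inertia by much.
for k, inertia in inertias.items():
    print(f"k={k}: inertia={inertia:.1f}")
```

Reading the output is still a judgment call: look for the k after which the drop in inertia flattens out.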

Coming Soon

Smart Watch Price Prediction

Level 3, 25 minutes

In this simple ensemble methods project, we explore the major types of ensemble methods in Sklearn, such as the Random Forest, Gradient Boosting, and Bagging regression models. It is good practice to try more than one model. The complex interactions that happen during prediction are hard for our human brains to interpret, so it's best to experiment with different models to find the best machine learning model for your dataset.


Classic Car MPG

Level 3, 24 minutes

In this Python Regression project, we will be predicting the MPG of classic cars. Use ensemble methods like RandomForestRegressor and GradientBoostingRegressor in this supervised machine learning project. This is a great beginner Python project to practice machine learning with ensemble methods.


Credit Card Approvals 

Level 4, 40 minutes

In this Python project, we will use Sklearn for a supervised classification problem. We will focus on error analysis: understanding precision versus recall and why we would want to focus on one over the other. We will use the error analysis tools in Yellowbrick to try to improve our model's score.
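The precision-versus-recall trade-off can be illustrated with a small hand-made example. The labels below are hypothetical and are not taken from the credit card dataset; they only show how the two metrics are computed from the same predictions.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical labels: 1 is the positive class, 0 is the negative class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Precision: of everything predicted positive, how much really was positive?
precision = precision_score(y_true, y_pred)

# Recall: of everything that really was positive, how much did we find?
recall = recall_score(y_true, y_pred)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Which metric to prioritize depends on which mistake costs more: false positives hurt precision, while false negatives hurt recall.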


Polish Car Price Regression

Level 5, 40 minutes

Predict the price of cars in Poland. This supervised learning problem in Python is a regression problem. In this project we will focus on linear regression techniques, including the PassiveAggressiveRegressor and ARDRegression models. Ever wonder which is the best machine learning regression model?

In this project we choose to test a diverse set of ML models in Sklearn. Here we use ARDRegression and KNeighborsRegressor as the base models in a BaggingRegressor ensemble method. We also test the Gradient Boosting Regressor and the Random Forest Regressor in this guided Python project.
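Wrapping different base models in a BaggingRegressor might look roughly like the sketch below. The synthetic data from `make_regression` and every hyperparameter here are assumptions for illustration only; the project itself works with the car price dataset.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.linear_model import ARDRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (an assumption, not the car price dataset).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The two base models named in the text.
base_models = {
    "ARDRegression": ARDRegression(),
    "KNeighborsRegressor": KNeighborsRegressor(),
}

# Bag each base model and record its R^2 score on the held-out rows.
scores = {}
for name, base in base_models.items():
    # The base model is passed positionally because its keyword name
    # differs between Sklearn versions.
    bagged = BaggingRegressor(base, n_estimators=10, random_state=0)
    bagged.fit(X_train, y_train)
    scores[name] = bagged.score(X_test, y_test)

for name, r2 in scores.items():
    print(f"Bagged {name}: R^2 = {r2:.3f}")
```

Bagging trains each copy of the base model on a bootstrap sample and averages the predictions, which tends to help high-variance models such as k-nearest neighbors more than stable linear ones.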

 

We tune these models with a Bayesian hyperparameter search, which uses the results of earlier trials, together with an estimate of its own uncertainty, to choose the next settings to try. This probabilistic approach typically explores the parameter space with fewer evaluations than a traditional exhaustive grid search. Its model of the search space also supports more informed decisions, giving a sense of how sensitive the score is to each parameter and what the trade-offs are.

Shapley Values

Follow along with this Python Regression Project. Here we deal with a common problem in house price prediction: too many features, and how to choose which ones to use. We will use Shapley values to estimate the real impact of each feature on the final prediction, and then decide which features can be removed because they don't help.
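To make the idea of a Shapley value concrete, here is a small pure-Python sketch that computes exact Shapley values for one prediction by enumerating feature subsets. It does not use the Shap library's API. The function `shapley_values`, the toy model `toy_price`, and all the numbers are hypothetical, and "absent" features are simply replaced by a baseline value, which is one common convention among several.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for row x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        # Prediction when only the features in `subset` come from x.
        row = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(row)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_price(row):
    # Hypothetical house price model; the third feature has no effect.
    area, rooms, door_color = row
    return 50_000 + 1_000 * area + 5_000 * rooms + 0 * door_color


x = [120, 3, 7]         # the house being explained
baseline = [100, 2, 1]  # a reference "average" house
phi = shapley_values(toy_price, x, baseline)

for name, contribution in zip(["area", "rooms", "door_color"], phi):
    print(f"{name}: {contribution:+.0f}")
```

The contributions add up to the difference between the prediction for this house and the prediction for the baseline house, and a feature whose contribution stays near zero across many rows is a candidate for removal. Exact enumeration grows exponentially with the number of features, so it is only practical for small toy cases like this one.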


In the second part of the Python Guided Machine Learning Project, the data scientist picks up where the data analyst left off. We use the data analyst's insights to guide the data scientist.
This is extremely helpful in a team setting, because the data scientist can focus on building the model. And as we will see, there is a lot to try when building a model.