My Portfolio - Prince Nwaekwu
GitHub profile: https://github.com/dthatprince
Welcome to my portfolio! This repository showcases my projects and other fun stuff.
About Me
I am Prince Nwaekwu, a Python specialist with expertise in Data Science, Machine Learning, MLOps, and backend microservices using Flask and FastAPI. My work focuses on developing scalable solutions that transform complex data into actionable insights and streamline operations.
I have a proven track record in deploying machine learning models, enhancing data workflows with MLOps, and building robust APIs. My technical skills are complemented by my ability to communicate effectively with both technical and non-technical stakeholders, making complex concepts accessible and actionable.
I am committed to continuous learning and applying cutting-edge technologies to solve real-world problems. I am keen to contribute to projects that leverage Python to drive innovation and deliver value.
Let’s explore the possibilities of data and Python together!
Thank you for taking the time to learn more about me and my work!
Projects
FinChain Advisor App
- This application lets users query financial data in natural language. It demonstrates how to integrate a ChromaDB vector store into a LangChain pipeline to build a GPT-powered investment advisor Q&A agent for querying and analyzing financial data.
Repository: Github Link
Data anonymization for LLMs with OpenAI, LangChain and Microsoft Presidio
- Developing and exploring use cases for reversible anonymization, multi-language anonymization, and question answering with privacy protection.
Repository: GitHub Link
- Description: This project demonstrates how to create Google Cloud Platform (GCP) infrastructure with Terraform. It sets up a Google Storage Bucket and a BigQuery dataset in the europe-west1 region, using a service account for authentication.
Repository: Github Link
Image Object Detection using ImageAI
Anomaly Detection in Time Series
This project explores different methods for anomaly detection in time series data, specifically focusing on CPU utilization data from an AWS EC2 instance. The following techniques are implemented and compared:
Mean Absolute Deviation (MAD), Isolation Forest, and Local Outlier Factor (LOF)
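As a rough illustration of the first technique, the sketch below flags points whose distance from the mean is large relative to the mean absolute deviation. It is not taken from the project itself: the function name `mad_anomalies`, the threshold of 3.0, and the sample CPU values are all assumptions made up for this example.

```python
from statistics import mean


def mad_anomalies(values, threshold=3.0):
    """Return indices of points that lie more than `threshold` mean
    absolute deviations away from the mean (hypothetical helper)."""
    center = mean(values)
    # Mean absolute deviation: average distance of each point from the mean.
    mad = mean([abs(v - center) for v in values])
    if mad == 0:
        return []  # all points are identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - center) / mad > threshold]


# Made-up CPU utilization percentages, for illustration only.
cpu = [12.0, 13.5, 11.8, 12.9, 13.1, 95.0, 12.4, 13.0]
print(mad_anomalies(cpu))  # prints [5]
```

Because a single extreme value also inflates the mean and the deviation, this kind of score can be less sensitive on short series, which is one reason to compare it against Isolation Forest and LOF.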
PySpark ML Pipeline for Bank Term Deposit Subscription Prediction
Docker-Based Data Ingestion Pipeline
Multivariate Anomaly Detection Using the UCI Thyroid Disease Data Set
- This project demonstrates a comprehensive approach to multivariate anomaly detection using Python within a Jupyter Notebook. It guides users through loading, preprocessing, training, and evaluating models on the UCI Thyroid Disease dataset.
Repository: Github Link
Univariate Anomaly Detection (Machine Learning Methods)
- This project implements univariate anomaly detection using machine learning techniques in Python, within a Jupyter Notebook environment. It focuses on detecting unusual patterns or outliers in single-variable datasets using advanced machine learning methods.
Repository: Github Link
Euro 2024 Parliament Elections: LSTM Text Prediction Model
- Description: This project utilizes an LSTM (Long Short-Term Memory) model to perform text completion and prediction. The training data is sourced from the Wikipedia page about the 2024 European Parliament elections. The model is designed to predict the next word in a given sequence of text, demonstrating the capabilities of LSTM networks in handling sequential data.
- Repository: GitHub
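Next-word prediction is usually framed as a supervised problem: each training example pairs a window of preceding words with the word that follows. The sketch below shows only that data preparation step in plain Python, not the LSTM itself. The function name `build_training_pairs`, the context size of 3, and the sample sentence are assumptions for illustration and are not taken from the project.

```python
def build_training_pairs(text, context_size=3):
    """Turn raw text into (context ids, next-word id) pairs (hypothetical helper).

    Index 0 is reserved for left-padding short contexts.
    """
    tokens = text.lower().split()
    vocab = {word: idx for idx, word in enumerate(sorted(set(tokens)), start=1)}
    ids = [vocab[word] for word in tokens]

    pairs = []
    for end in range(1, len(ids)):
        context = ids[max(0, end - context_size):end]
        # Left-pad so every context has the same length.
        padded = [0] * (context_size - len(context)) + context
        pairs.append((padded, ids[end]))
    return vocab, pairs


# Placeholder sentence, not the actual Wikipedia training text.
sample = "the parliament elections were held in june"
vocab, pairs = build_training_pairs(sample)
print(len(pairs))  # prints 6
print(pairs[0])    # prints ([0, 0, 6], 5)
```

Pairs in this shape can then be fed to a sequence model, with the context as input and the next-word id as the classification target.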
GPT-2 Text Generator App
- Description: This project is a text generator application built using GPT-2, Gradio, and TensorFlow. The app takes a user-provided sentence as input and generates a complete paragraph based on that sentence.
- Repository: GitHub
AI Blog Post Generator using LLAMA2
- Description: This project is an AI-powered blog post generator built with Streamlit. It uses the LLaMA-2 model to generate blog posts based on user inputs.
Features:
Data Analysis and Personalized Offers for an Effective Marketing Strategy (Completing Soon - this August 2024)
- Description: Leveraging Personalized Offers and Data Analysis for an Effective Marketing Strategy.
- Repository: GitHub
Pricing under Uncertainty - Implied Volatility Model - Risk-Free Interest Rates and Back-Testing (Completing Soon - this August 2024)
- Link is unavailable right now.
Credit Risk Modeling and Credit Scoring For Financial Lending
- Description: This project is a comprehensive walkthrough of building state-of-the-art credit scoring models tailored to financial lending in the Fintech industry. It focuses on credit risk modeling and the role of data science in the lending industry.
- Repository: GitHub
Forecasting Views for a Medium article using FB Prophet - Time Series
- Description: Forecasting Views for a Medium article using FB Prophet.
- Repository: GitHub
Analyzing Patient’s Medical Appointments
- Description: Exploring Factors Associated with No-Show Appointments in Medical Settings
- Repository: GitHub
Machine Learning Models Analysis using Yellowbrick Visualizers and API
- Description: My documentation of the Yellowbrick visualizers and API. It covers various production-ready visualizers, with code examples showing how to use them.
- Repository: GitHub
Analyzing Streaming Service Content with SQL
May 2022
This project explores streaming service content using SQL, focusing on the major players in the streaming industry: Netflix, Amazon, Hulu, and Disney+.
Project link: DataCamp
Player Retention A/B Testing for Mobile Game
- Description: The goal of this project is to run an A/B test and a player retention analysis on the Cookie Cats game dataset to investigate which factors affect player retention. Player retention is a crucial metric in the gaming industry, as it directly correlates with the success and growth of a game. By understanding the factors that influence retention, game developers can make informed decisions to optimize player engagement and enhance the overall gaming experience.
- Repository: GitHub
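One common way to compare retention between two test groups is a two-proportion z-test. The sketch below is an assumption about how such a comparison could be done, not necessarily the method used in the project. The function name and the retention counts are invented placeholders, not figures from the Cookie Cats dataset.

```python
import math


def two_proportion_z_test(retained_a, total_a, retained_b, total_b):
    """Return (z statistic, two-sided p-value) for the difference between
    two retention rates (hypothetical helper)."""
    p_a = retained_a / total_a
    p_b = retained_b / total_b
    # Pooled rate under the null hypothesis that both groups retain equally.
    pooled = (retained_a + retained_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Placeholder counts: 450 of 1000 players retained in group A, 400 of 1000 in group B.
z, p_value = two_proportion_z_test(450, 1000, 400, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference in retention is unlikely to be due to chance alone; a bootstrap over the raw player rows is a reasonable alternative when the normal approximation is in doubt.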
Text Mining - Text Analysis, Classification, and Clustering (R & Python)
- Description: This project focuses on applying text analysis, classification, and clustering methods to a real-world corpus using R and Python. The goal is to clean the data, perform statistical analysis, create a classification model, and apply a clustering method.
- Repository: GitHub
Yelp Business Reviews Sentiment Analysis
- Description: Understanding how customers or users feel about a business. The project uses BeautifulSoup and Python to collect business reviews from the web, cleans and lemmatizes the review text to extract meaningful insights, and calculates sentiment using TextBlob's sentiment property.
Global Terrorism Data Analysis and Prediction, Model Deployment using Streamlit
Recommendation System with Python, Machine Learning and AI
- Description: This repository contains the implementation of various recommendation systems using Python and machine learning techniques. The goal of these systems is to provide personalized recommendations to users based on their preferences and historical data.
The implemented recommendation systems include:
- Classification-based Collaborative Filtering Systems - Logistic Regression
- Content-Based Recommenders - Nearest Neighbors Algorithm
- Correlation-Based Recommenders
- Model-based Collaborative Filtering Systems with SVD Matrix Factorization
- Popularity-Based Recommenders
- Repository: GitHub
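The simplest of the approaches listed above, a popularity-based recommender, can be sketched in a few lines: it ranks items by how many ratings they received and recommends the same top items to every user. The function name, the tuple layout, and the ratings below are assumptions made up for illustration and do not come from the repository.

```python
from collections import Counter


def popularity_recommender(ratings, top_n=2):
    """Return the `top_n` most frequently rated items (hypothetical helper).

    `ratings` is an iterable of (user, item, rating) tuples.
    """
    counts = Counter(item for _user, item, _rating in ratings)
    return [item for item, _count in counts.most_common(top_n)]


# Made-up ratings, for illustration only.
ratings = [
    ("u1", "A", 5), ("u2", "A", 4), ("u3", "A", 3),
    ("u1", "B", 4), ("u2", "B", 2),
    ("u3", "C", 5),
]
print(popularity_recommender(ratings))  # prints ['A', 'B']
```

This baseline ignores individual preferences entirely, which is what the content-based and collaborative filtering approaches are meant to improve on.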
Automation with Python
Using Python for Automating:
- Reading and writing files
- Organizing directories
- Web scraping with Beautiful Soup
- Automating web browsing with Selenium
- Automating with APIs
- Creating API requests
- Linking API calls
- Repository: GitHub
Human Resource Data Analysis & Churn Prediction
- Description: An analysis of employee data in the dataset “HR_comma_sep.csv” to find out what contributes to employees leaving the company.
- Repository: GitHub
Data Analysis of Uber Trips
University Degree - Gender Gap Analysis
- Description: Analysis of the Gender Gap in University Degrees (STEM)
- Repository: GitHub
House Price Prediction Using TensorFlow - Deep Learning
Customer Segmentation using KMeans - Python
NLP Project - Chatbot Using NLTK
- Description: Built a chatbot that answers questions about global warming (Python & NLTK).
- Repository: GitHub
Netflix Movies & TV Shows Analysis
- Description: Performing exploratory data analysis (EDA) to check whether the average duration of movies has been declining.
- Repository: My DataCamp WorkSpace
Google Play Store App and Reviews - Data Analysis and Visualization
- Description: Analysis and visualization of Google Play Store data. Play Store app data can help app-making businesses succeed by giving developers actionable insights for capturing the Android market.
- Repository: GitHub
- Repository: My DataCamp WorkSpace
Analyzing the Decline of Movie Durations on Netflix (2011-2020)
- Description: In this project, I aimed to explore the trend of declining movie durations on Netflix from 2011 to 2020. Using a CSV file containing Netflix data and Python’s pandas library, I created a DataFrame and analyzed the average movie durations over the years. I also visualized the data using Matplotlib and Seaborn libraries to better understand the trend and its possible causes. By the end of the project, I had gained insights into the changing landscape of movie durations on Netflix and developed my data manipulation and visualization skills using Python.
- Repository: My DataCamp WorkSpace
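The core of this analysis is a group-by: keep only movies, then average the duration per release year. The project did this with a pandas DataFrame; the sketch below swaps in Python's standard `csv` module so it runs without extra dependencies. The column names (`type`, `release_year`, `duration`) and the sample rows are assumptions for illustration, not the actual Netflix data.

```python
import csv
import io
from collections import defaultdict


def average_duration_by_year(csv_file):
    """Return {release year: average movie duration in minutes}
    (hypothetical helper; column names are assumed)."""
    totals = defaultdict(lambda: [0, 0])  # year -> [sum of minutes, movie count]
    for row in csv.DictReader(csv_file):
        if row["type"] != "Movie":
            continue  # skip TV shows, whose duration is not in minutes
        year = int(row["release_year"])
        totals[year][0] += int(row["duration"])
        totals[year][1] += 1
    return {year: total / count for year, (total, count) in sorted(totals.items())}


# Made-up rows standing in for the real CSV file.
sample = io.StringIO(
    "title,type,release_year,duration\n"
    "Film A,Movie,2011,100\n"
    "Film B,Movie,2011,110\n"
    "Show C,TV Show,2011,1\n"
    "Film D,Movie,2020,90\n"
)
averages = average_duration_by_year(sample)
print(averages)  # prints {2011: 105.0, 2020: 90.0}
```

With pandas, the same result comes from filtering the DataFrame to movies and calling `groupby` on the year column followed by `mean` on the duration column.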
Analyzing the Scala Programming Language Repository
I analyzed the development history of the Scala programming language by reading, cleaning, and visualizing data from its real-world project repository.
The dataset included information about pull requests and the files that were modified by each request, which were previously mined and extracted from GitHub. I used Python to read, clean, and visualize the data to identify the individuals who had the most influence on Scala’s development and determine the language experts.
I also explored trends in the development of Scala over time to gain insights into its evolution as an open-source project. By the end of the project, I had developed my data manipulation, visualization, and analysis skills using Python and gained a better understanding of the development of the Scala programming language.
Feel free to reach out to me for collaboration or any inquiries:
- Email: nwaekwuprince@gmail.com
- LinkedIn: Profile
Thank you for visiting my portfolio!